Feature Engineering and Data Prep – Still Needed?

It is a truism of machine learning and predictive analytics that 80% of an analyst’s time is consumed in cleaning and preparing the needed data. I saw an estimate by a Google engineer that 25% of the time was spent just looking for the right data. A big part of this process is human-driven feature…

Job Spotlight: Risk Analyst

Many jobs are centered around risk management. If you’re looking through job postings, of course, you’ll see lots of jobs whose purpose is to make sure that nothing bad happens, the equivalent of locking the doors and closing the windows. More interesting from a statistical perspective are the jobs that assume that bad things…

Problem of the Week: The Value of Bedrooms

Question: You work for an internet real-estate company, building statistical models to predict home price on the basis of square footage, number of bedrooms, number of bathrooms, property type (single family home, townhouse, multiplex), and age. Surprisingly, you find the coefficient for bedrooms is negative, meaning that adding bedrooms decreases value. What might account for…

Statistically Significant – But Not True

If you are looking for the Feature Engineering blog post, you can find it here: https://www.statistics.com/feature-engineering-data-prep-still-needed/

In 2015, at an Alzheimer’s conference, Biogen researchers presented dramatic brain scans showing that the antibody aducanumab effectively cleared out plaque in the brain, plaque that was associated with Alzheimer’s disease. Their study involved 166 patients in a randomized,…

Book Review: Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are

This week’s book review is of Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are, Seth Stephens-Davidowitz’s fascinating book about how social media data reveals all sorts of things about us that we barely know ourselves. For example, did you know that the ages 8-12 are…

Industry Spotlight – Precision Agriculture

The application of analytics to agriculture has given rise to what is called “precision agriculture”, a science that seeks to take advantage of detailed information that is local in time and place. Tractors and farm equipment are being equipped with sensors and software that allow them to respond automatically to external data, and…