What We're Collecting
| What We Capture | Why It Matters |
|---|---|
| Star Rating (1-5) | The obvious signal, but easily manipulated |
| Review Text | The real story - what did they love/hate? |
| Would Recommend? | Often more honest than star ratings |
| Helpful Votes | Crowd wisdom - which reviews are actually useful? |
| Reviewer Demographics | Skin type, skin tone, age - does it work for people like you? |
| Photos | Visual proof the reviewer actually used the product |
| Incentivized Flag | Did Sephora give them free product for this review? |
| Date Posted | Is this a recent opinion or from years ago? |
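The fields in the table can be pictured as a single record type. The sketch below is only one possible shape: the class name `Review` and every field name are illustrative assumptions, not names taken from Sephora or the review API.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class Review:
    """One review record. All names here are hypothetical, chosen for illustration."""
    rating: int                       # star rating, 1-5
    text: str                         # free-form review text
    would_recommend: Optional[bool]   # None when the reviewer skipped the question
    helpful_votes: int                # number of "helpful" votes from other shoppers
    demographics: dict = field(default_factory=dict)  # e.g. skin type, skin tone, age
    photo_urls: list = field(default_factory=list)    # links to reviewer photos
    incentivized: bool = False        # True if free product was given for the review
    posted_at: str = ""               # date posted, kept as a plain string here

    def __post_init__(self):
        # Reject ratings outside the 1-5 star scale.
        if not 1 <= self.rating <= 5:
            raise ValueError(f"rating must be 1-5, got {self.rating}")


example = Review(rating=4, text="Lovely texture.", would_recommend=True, helpful_votes=12)
record = asdict(example)  # plain dict, ready to serialize as JSON
```

Keeping `would_recommend` optional preserves the difference between "no" and "did not answer", which matters if that field is treated as a more honest signal than the star rating.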
How We Get It
Sephora uses a service called Bazaarvoice to power its review system. This service has an API (think of it like a data faucet) that we can tap into.
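To make the "data faucet" idea concrete, here is a minimal sketch of building one request URL. The endpoint, the parameter names (`passkey`, `apiversion`, `Filter`, `Limit`, `Offset`), and the version string are assumptions about a reviews endpoint of this kind, so check them against the provider's documentation. The function name `build_reviews_url`, the product ID, and the passkey value are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the API provider's documentation.
BASE_URL = "https://api.bazaarvoice.com/data/reviews.json"


def build_reviews_url(product_id, passkey, offset=0, limit=100, api_version="5.4"):
    """Return the URL for one page of reviews for a single product."""
    params = {
        "passkey": passkey,                   # API key (placeholder value below)
        "apiversion": api_version,            # assumed version string
        "Filter": f"ProductId:{product_id}",  # restrict results to one product
        "Limit": limit,                       # reviews per page
        "Offset": offset,                     # how many reviews to skip
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_reviews_url("P12345", passkey="YOUR_PASSKEY", offset=200)
```

Each call describes one "turn of the faucet": a fixed-size page of reviews starting at `offset`.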
1. Product Discovery: Start with a list of all Sephora products (from the site's sitemap).
2. API Requests: For each product, request all of its reviews from the Bazaarvoice API.
3. Handle Challenges: Deal with rate limits, errors, and pagination (reviews come in batches of 100).
4. Storage: Save everything as compressed files for later processing.
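The steps above can be sketched as a pagination loop with retries, followed by compressed storage. This is a minimal sketch under stated assumptions, not the project's actual code. `fetch_page` is a hypothetical callable that performs one API request, and `RetryableError`, `fetch_all_reviews`, and `save_reviews` are illustrative names. The response keys `Results` and `TotalResults` are also assumed.

```python
import gzip
import json
import time


class RetryableError(Exception):
    """Hypothetical error type for rate limits and transient failures."""


def fetch_all_reviews(product_id, fetch_page, page_size=100, max_retries=3, sleep=time.sleep):
    """Collect every review for one product, one page at a time.

    fetch_page(product_id, offset, limit) is assumed to return a dict with
    "Results" (a list of reviews) and "TotalResults" (an int).
    """
    reviews, offset = [], 0
    while True:
        for attempt in range(max_retries):
            try:
                page = fetch_page(product_id, offset, page_size)
                break
            except RetryableError:
                if attempt == max_retries - 1:
                    raise  # out of retries, surface the error
                sleep(2 ** attempt)  # exponential backoff: 1s, then 2s, ...
        results = page.get("Results", [])
        reviews.extend(results)
        offset += page_size
        # Stop on an empty page or once every review has been requested.
        if not results or offset >= page.get("TotalResults", 0):
            return reviews


def save_reviews(reviews, path):
    """Write reviews as gzip-compressed JSON Lines, one review per line."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for review in reviews:
            f.write(json.dumps(review) + "\n")
```

Passing `fetch_page` and `sleep` in as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise with fake data. JSON Lines inside gzip is one reasonable reading of "compressed files"; other formats would work equally well.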