Tag: ai

  • The Economics of Rewarding Yourself

    I’m Shivani, a Senior Data Scientist who writes about building with AI, product thinking, and the occasional experiment on myself.

    If you’ve been anywhere near the tech world lately, you’ve heard “agentic AI” thrown around until it loses meaning. I was one of those people nodding along while feeling quietly overwhelmed. So I decided to stop reading about it and build something with it instead.

    It had to be something I actually cared about.

    Not a generic todo app. Not another chatbot.

If there’s one app that genuinely makes me happy, it’s the Starbucks app. I barely spend time on it, but when I open it, order my drink, and watch those stars land in my account, something lights up. And when I finally get to redeem them? That little moment of “I earned this” is disproportionately satisfying for what it is.

    That feeling is behavioral economics in action. I realized that if Starbucks can make me feel that way about a latte, then I could create the same loop for my own life. Not streaks. Not habit trackers that guilt you when you miss a day. Stars. Real rewards. A system that feels like a game you actually want to play.

I divided life into five areas: Health, Work, Upskilling Prep, Mindfulness, and Joy & Connection. Each area had tasks. Each task had stars. Complete the task, earn the stars. Simple.

    My baseline assumption: 100 stars should feel like a great day.

    Then I did the math.

    Health → 60–80 stars possible
    Work → 30–70 stars possible
    Upskilling prep → 60–90 stars possible
    Joy & Connection → 25–40 stars possible
    Mindfulness → 25–50 stars possible
    Total possible → 200–300+ stars a day

A coffee reward costing 100 stars becomes automatic. You hit it before lunch. By Wednesday you have enough for the “experience” tier. The stars stop feeling special. And when stars stop feeling special, the whole system collapses. That’s the Starbucks effect breaking down. Earning stars on the Starbucks app isn’t easy. That’s the point.

    But inflation wasn’t even the worst problem I found.

    I realized the system could reward fake productivity. You could avoid the hard, meaningful work entirely and stack easy tasks instead:

    Drink water → 15 stars
    Read 20 min → 25 stars
    Listen podcast → 15 stars
    Total → 55 stars, zero real progress

    It feels productive. It isn’t. The system was accidentally designed to let you game yourself.

The fix required rethinking the whole architecture. Instead of asking, “How many stars should this task earn?” I started from the other end: “How often do I want to redeem a small reward, and what should earning that feel like?” Then I worked backwards.

    Design from rewards down, not tasks up.

    I also capped how many tasks count toward stars per day, weighted difficulty over activity, and made the daily maximum slightly out of reach, so a perfect day feels genuinely earned, not automatic.
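Here’s a minimal sketch of what that looks like in code. Every task name, star value, weight, and cap below is an illustrative stand-in, not the app’s real numbers:

TASKS = {
    "deep_work_block": {"stars": 30, "difficulty": 1.0},  # hard, meaningful work
    "workout":         {"stars": 25, "difficulty": 0.8},
    "read_20_min":     {"stars": 10, "difficulty": 0.4},
    "drink_water":     {"stars": 5,  "difficulty": 0.2},  # easy tasks earn almost nothing
}

MAX_COUNTED_TASKS = 5  # only the top N tasks count toward the day
DAILY_MAX = 110        # just above the 100-star "great day" target, never automatic

def stars_for_day(completed: list[str]) -> int:
    """Score a day: weight stars by difficulty, keep the top tasks, cap the total."""
    weighted = sorted(
        (round(TASKS[t]["stars"] * TASKS[t]["difficulty"]) for t in completed),
        reverse=True,
    )
    return min(sum(weighted[:MAX_COUNTED_TASKS]), DAILY_MAX)

print(stars_for_day(["deep_work_block", "workout", "drink_water"]))  # 51

The cap is what closes the fake-productivity loophole: once your top tasks are counted, stacking glasses of water adds nothing.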

    Building this taught me something I couldn’t have learned by reading about product design.

When you are both the designer and the user, every decision becomes personal. There’s no “target persona.” You know exactly what will make you open the app at 9pm when you’re exhausted, and what will make you delete it.

    That clarity is brutal and useful at the same time.

    A few things I now understand differently:

    Behavioral economics isn’t abstract theory. It’s the difference between a system you use and one you abandon. Positive reinforcement only works if the reward feels earned. Over-reward and the dopamine hit disappears. Under-reward and you stop trying. The sweet spot is stars that are slightly out of reach, just enough to feel possible, never quite automatic.

    Design for your worst days, not your best ones. I built three day modes into the app.

Full Day, Low Energy, and Sprint.

The most important one is Low Energy. On hard days, two completed tasks still count. No guilt. A system that only works when you’re feeling good isn’t a system; it’s a fair-weather friend.
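As a rough sketch (thresholds invented for illustration, not pulled from the app), the modes are just different definitions of “enough”:

DAY_MODES = {
    "full":       {"tasks_for_good_day": 6},
    "sprint":     {"tasks_for_good_day": 3},  # one focused push counts
    "low_energy": {"tasks_for_good_day": 2},  # two tasks is a win
}

def is_good_day(mode: str, tasks_done: int) -> bool:
    """What counts as a good day scales with the mode you picked, not a fixed bar."""
    return tasks_done >= DAY_MODES[mode]["tasks_for_good_day"]

print(is_good_day("low_energy", 2))  # True: no guilt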

    MVP scoping is a skill. The first version of this app has one job: tap a task, earn stars, feel good. No login, no social features, no complex dashboards. Just the core loop working. Everything else is next week’s problem.

I also added a Claude-powered agentic layer that reads your day (what you’ve done, what time it is, what mode you’re in) and suggests the single best next action. But that’s a whole other post.

    I’m building this in public.

    Next post: what makes an AI system actually agentic, and why the context you give it matters more than the prompt.

    If you’re building something similar or have thoughts on reward design, I’d love to hear from you.

    Follow along at notesfromshivani.com/technology or on LinkedIn.

    More product thinking + building in public

    If this was useful, there’s more where it came from.

  • How CUPED Makes Your A/B Tests Smarter (Without Needing More Users)

    Have you ever run an A/B test and found… nothing?

    No clear difference. Just noisy results that leave you wondering, “Did this even do anything?”

That’s where CUPED comes in, short for Controlled-experiment Using Pre-Experiment Data. It sounds complicated, but the idea is surprisingly simple: use what you already know about users to get better results, faster.

    So what is CUPED, in plain English?

    CUPED is a way to reduce the random noise in your experiment results by using data you already have from before the experiment started.

    Let’s say you’re testing a new homepage design to see if more users sign up.
    You know how often each user visited your site last week, before they saw the new page.

    That number (past visits) often affects how likely they are to sign up this week. CUPED uses that pattern to adjust your results so you can see the real impact of your new design, not just luck.

    A simple example

    You split users into two groups:

    • Control: sees the old homepage
    • Treatment: sees the new homepage

    But what if your treatment group just happened to include more people who visited last week? They might naturally sign up more — even if the new page did nothing.

    CUPED solves that by adjusting for past visits. You run a simple regression:

    signups_this_week ~ visits_last_week

    This gives you a coefficient (let’s call it θ, theta) showing how much past visits predict sign-ups.

    Then, for each user, you subtract the part of their behavior explained by past visits.
Now you compare adjusted sign-up numbers across groups: a fairer, clearer test.

    adjusted_signups = signups_during_test - θ * (visits_last_week - average_visits)

    This removes the part of the outcome that’s explained just by past behavior. Now run your A/B test using these adjusted sign-up numbers.
    They have less random noise, so differences between control and treatment are easier to detect.
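Here’s a runnable sketch of the whole pipeline on simulated data. The θ that minimizes variance is just that regression slope, cov(Y, X) / var(X); the metric, effect size, and noise below are all made up for illustration:

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated users: pre-experiment visits strongly predict a toy sign-up metric
n = 500  # users per group
visits_last_week = rng.poisson(3, size=2 * n)
group = np.repeat(["control", "treatment"], n)
signups = (
    2.0 * visits_last_week                  # part explained by past behavior
    + rng.normal(0, 1, size=2 * n)          # random noise
    + 0.2 * (group == "treatment")          # the small true effect we want to detect
)

# theta: the regression slope of outcome on covariate, cov(Y, X) / var(X)
theta = np.cov(signups, visits_last_week)[0, 1] / np.var(visits_last_week, ddof=1)

# Subtract the part of each user's outcome explained by past visits
adjusted = signups - theta * (visits_last_week - visits_last_week.mean())

for label, y in [("raw", signups), ("CUPED-adjusted", adjusted)]:
    diff = y[group == "treatment"].mean() - y[group == "control"].mean()
    _, p = stats.ttest_ind(y[group == "treatment"], y[group == "control"])
    print(f"{label:>15}: diff = {diff:+.3f}, p = {p:.4f}")

Because past visits explain most of the raw variance in this setup, the adjusted comparison has a standard error several times smaller, which is exactly the extra power CUPED buys with the same users.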

    Why use CUPED?

    1. More statistical power without needing more users
    2. Shorter experiments
    3. Cleaner insights even with messy data

    It’s especially useful when your metric is noisy or your sample size is small.

    When NOT to use CUPED?

    CUPED works best when your pre-experiment metric is strongly correlated with your outcome.
If it isn’t, CUPED might not help; it could even add noise.

    So always test that correlation first!
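One quick way to run that check. It leans on a result from the paper: the adjusted variance is var(Y) × (1 − ρ²), where ρ is the correlation between covariate and outcome. The arrays below are simulated stand-ins; plug in your own:

import numpy as np

rng = np.random.default_rng(0)

# Stand-in data; replace with your real pre-experiment and outcome arrays
visits_last_week = rng.poisson(3, size=1000)
signups = 2.0 * visits_last_week + rng.normal(0, 1, size=1000)

# Variance reduction from CUPED is the squared correlation: var_adj = var(Y) * (1 - corr**2)
corr = np.corrcoef(signups, visits_last_week)[0, 1]
print(f"correlation = {corr:.2f} -> expect ~{corr**2:.0%} less variance")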

    Want to go deeper?

This post was inspired in part by Lyft’s blog on experimentation and Microsoft’s paper on CUPED: Improving the Sensitivity of Online Controlled Experiments by Utilizing Pre-Experiment Data.

    Official Source:

Title: Improving the Sensitivity of Online Controlled Experiments by Utilizing Pre-Experiment Data
Authors: Alex Deng, Ya Xu, Ron Kohavi, Toby Walker
    Link: https://www.researchgate.net/publication/237838291_Improving_the_Sensitivity_of_Online_Controlled_Experiments_by_Utilizing_Pre-Experiment_Data

    CUPED is like giving your experiment a smarter starting point.
    By using what you already know about your users, you can get faster answers and make better decisions with no extra data required. You’re subtracting the influence of past behavior to better isolate the true effect of your test.

    If you found this helpful or want to dig deeper into data science insights, check out the Technology Blog section on NotesfromShivani, a space where I break down complex ideas in a clear, simple way. And if you’d like to connect, chat, or share your thoughts, I’m always up for a good data conversation over on LinkedIn. Come say hi!