This is one of my notes from the online Causal Inference course at Columbia University, taught by Michael E. Sobel, a professor in the Department of Statistics. It is worth keeping this overview of the causal inference framework in mind before getting into statistical and theoretical details, especially for beginners. A clear framework matters not only because rigorous experimental designs and analysis methods can be developed within it, but also because it guides you through challenging situations correctly while keeping the assumptions and limitations explicit. That is what makes causal inference a science.

Modern Approach for Causal Inference

The modern dominant approach for causal inference, significantly influenced by Donald Rubin’s contributions, primarily revolves around the following key ideas:

  1. Potential Outcomes Framework:

    • Potential Outcomes Notation: Introduced by Neyman and further developed by Rubin, this framework involves conceptualizing the outcomes that would occur both with and without the treatment for each unit. Each unit has a potential outcome under treatment and a potential outcome under control, but only one of these outcomes is observed for each unit.
    • Average Treatment Effects: The focus is on estimating the average causal effect of a treatment across a population. This involves comparing the average outcomes of treated and untreated groups, taking into account the potential outcomes framework.
  2. Randomization and Its Analogs:

    • Role of Randomization: In experimental studies, random assignment of treatments is crucial for ensuring that the treatment groups are comparable, allowing for unbiased estimation of causal effects.
    • Randomization-like Conditions in Observational Studies: Rubin extended the framework to observational studies by arguing that causal inferences can be made if these studies fulfill conditions similar to randomization. This involves controlling for confounding variables that influence both the treatment and the outcome, often through methods like matching, regression adjustment, or instrumental variables.
  3. Counterfactual Reasoning:

    • Counterfactual Conditionals: Causal relationships must satisfy counterfactual conditions. This means that for a cause to be deemed responsible for an effect, it should be demonstrable that if the cause had not occurred, the effect would not have occurred. This is formalized through the potential outcomes framework.

Key Features of Rubin’s Approach:

  • Application to Both Experimental and Observational Studies: Rubin’s framework is versatile and can be applied to both types of studies, providing a unified approach to causal inference.
  • Focus on Estimating Causal Effects: The primary goal is to estimate the causal effect of treatments or interventions, rather than simply identifying associations.
  • Use of Statistical Methods: The approach leverages statistical methods to control for confounding variables and to estimate causal effects, emphasizing the importance of rigorous statistical analysis.

Two Key Criteria of Modern Causal Inference

The modern dominant approach to causal inference primarily builds on two key criteria:

  1. Causation at the Singular Level:

    • This criterion allows for the possibility that causation can be specific to individual subjects or units, acknowledging effect heterogeneity. It means that a cause may produce an effect in one individual but not necessarily in another, depending on various conditions.
  2. Satisfaction of Counterfactual Conditionals:

    • A causal relationship must sustain a counterfactual conditional. This means that for a cause to be deemed responsible for an effect, it should be demonstrable that if the cause had not occurred, the effect would not have occurred. This criterion is essential for defining and reasoning about causal relationships in both experimental and observational studies.

Impact on Empirical Research:

Rubin’s contributions have led to more careful and precise inferences about causal effects in various disciplines, particularly in the social sciences. Researchers now more rigorously design studies and analyze data to ensure that their conclusions about causality are well-founded within this robust statistical framework.

Challenges in Randomized Studies:

  1. Non-compliance with Treatment Assignments:

    • Example: In a study by the University of Michigan in the 1990s, unemployed persons were assigned to receive or not receive assistance in job searching. A significant percentage of those assigned to the treatment group did not actually take the treatment, complicating the comparison between groups and potentially overestimating the treatment’s effectiveness.
  2. Intermediate Variables:

    • Example: An educational researcher wants to know the effect of encouragement to study on test scores. While the researcher can estimate the effect of encouragement, estimating the direct effect of study time is more complicated because it involves intermediate variables (encouragement affecting study time, which in turn affects test scores).
  3. Breakdown of Random Assignment:

    • Example: If a subject’s treatment adherence is influenced by their perception of treatment benefits, comparing only those who comply can lead to biased estimates.

Challenges in Observational Studies:

  1. Identifying and Measuring All Covariates:

    • Example: When studying the effect of education on earnings, researchers must account for various covariates that affect both education levels and earnings. Failure to identify or measure all relevant covariates can lead to biased estimates.
  2. Estimating Average Treatment Effects:

    • Example: In observational studies, various methods like matching, weighting, and regression are used to estimate treatment effects. Each method has its own set of practical issues and assumptions that need to be carefully managed.
  3. Longitudinal Observational Studies:

    • Example: When treatments administered in different periods depend on previous treatments and outcomes, analysis becomes more complicated.
  4. Interference:

    • Example: In a housing experiment conducted by the U.S. government, participants assigned to move from housing projects to suburbs knew each other. If the treatment assignment of one participant influenced the decision or outcome of another, traditional analysis methods might not be adequate.

These examples illustrate that both randomized and observational studies require careful consideration of various factors to ensure accurate and reliable causal inferences.

Potential Outcomes, Unit Effects, and Average Effects

Potential Outcomes Framework

  1. Potential Outcomes:
    • Each unit (e.g., individual) has two potential outcomes: one if treated and one if not treated. However, only one outcome can be observed for each unit.
    • This leads to the “fundamental problem of causal inference,” where we cannot observe both potential outcomes for a single unit.
  2. Notation and Unit Effects:
    • For a unit \(i\), denote the outcome as \(Y_i(1)\) if treated and \(Y_i(0)\) if not treated.
    • The unit effect is defined as the difference \(Y_i(1) - Y_i(0)\).
    • The observed outcome \(Y_i\) is determined by the treatment assignment \(Z_i\), where \(Z_i = 1\) if the unit is treated and \(Z_i = 0\) if not, so that \(Y_i = Z_i Y_i(1) + (1 - Z_i) Y_i(0)\) (see the code sketch after this list).
  3. Randomized vs. Observational Studies:
    • In randomized experiments, treatment assignment \(Z\_i\) is random.
    • In observational studies, subjects choose their treatment, introducing potential biases.
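
To make the notation concrete, here is a minimal Python sketch on hypothetical simulated data. It encodes both potential outcomes for every unit, assigns treatment at random, and constructs the observed outcome; all numbers are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential outcomes: Y(0) is a baseline; Y(1) adds a
# heterogeneous unit effect (true average effect = 2.0 by construction).
y0 = rng.normal(loc=10.0, scale=2.0, size=n)
y1 = y0 + rng.normal(loc=2.0, scale=1.0, size=n)

# Random treatment assignment, as in a randomized experiment.
z = rng.integers(0, 2, size=n)

# Observed outcome: Y_i = Z_i * Y_i(1) + (1 - Z_i) * Y_i(0).
y_obs = z * y1 + (1 - z) * y0

# The fundamental problem of causal inference: the simulator knows both
# y0 and y1, but a real dataset only ever contains y_obs and z.
print((y1 - y0).mean())  # average unit effect, close to 2.0
```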

Average Treatment Effects

  1. Sample Average Treatment Effect (SATE):
    • The average of the unit effects for the sample.
  2. Finite Population Average Treatment Effect (FATE):
    • The average treatment effect for a finite population from which the sample is drawn.
  3. Average Treatment Effect (ATE):
    • The average treatment effect in an infinite or large population. This is treated as an expectation of the potential outcomes.
  4. Estimands of Interest:
    • Various estimands depend on the marginal distributions of potential outcomes, such as the ATE and the Average Treatment Effect on the Treated (ATT); see the definitions after this list.
  5. Challenges and Assumptions:
    • Estimating these effects requires assumptions like the Stable Unit Treatment Value Assumption (SUTVA), which ensures that the potential outcomes are well-defined and not affected by other units’ treatments.
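
Written out with the notation above (the sample has \(n\) units), the standard definitions of these estimands are:

\[
\text{SATE} = \frac{1}{n}\sum_{i=1}^{n}\bigl(Y_i(1) - Y_i(0)\bigr), \qquad
\text{ATE} = E\bigl[Y(1) - Y(0)\bigr], \qquad
\text{ATT} = E\bigl[Y(1) - Y(0) \mid Z = 1\bigr].
\]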

Practical Implications

  1. Decision-Making:
    • Knowledge of average treatment effects aids decision-making in contexts like medical treatments and policy implementations.
  2. Ignorability Conditions:
    • Under certain conditions, known as ignorability or unconfoundedness, it is possible to use observed data to estimate causal effects reliably.
  3. Extensions and Assumptions:
    • The framework extends to multiple treatments and continuous treatments, though additional assumptions may be required.
    • SUTVA assumes no alternative representations of treatment and no interference between units, which may need adjustments in certain studies.

Conditions That Allow Average Effects to Be Unbiasedly/Consistently Estimated

Key Concepts

  1. Average Treatment Effect (ATE) Estimation:
    • Random Sampling: Drawing random samples of treated (\(y_1\)) and untreated (\(y_0\)) units to estimate their respective means.
    • Sample Means: The means of the treated (\(\bar{y}_1\)) and untreated (\(\bar{y}_0\)) samples serve as unbiased and consistent estimators of the population means.
  2. Unconfoundedness:
    • Definition: Treatment assignment \(z\) is independent of the potential outcomes (\(y_0\) and \(y_1\)).
    • Intuition: In randomized experiments, treatment assignment is blind to potential outcomes, ensuring unconfoundedness. In observational studies, treatment assignment might depend on factors related to potential outcomes, potentially confounding the estimates (a simulation sketch follows this list).
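
The simulated example above can be extended to illustrate this. Because random assignment makes \(z\) independent of \((y_0, y_1)\), the difference in sample means is unbiased for the ATE; averaging it over many re-randomizations recovers the true value. This is a sketch on hypothetical data, not a general-purpose estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential outcomes with a true ATE of 2.0.
y0 = rng.normal(loc=10.0, scale=2.0, size=n)
y1 = y0 + rng.normal(loc=2.0, scale=1.0, size=n)

# Re-randomize many times: under random assignment, z carries no
# information about (y0, y1), so the difference in means is unbiased.
estimates = []
for _ in range(1_000):
    z = rng.integers(0, 2, size=n)
    y_obs = z * y1 + (1 - z) * y0
    estimates.append(y_obs[z == 1].mean() - y_obs[z == 0].mean())

print(np.mean(estimates))  # roughly 2.0 on average
```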

Examples

  1. Randomized Experiment vs. Observational Study:
    • Randomized Experiment: Treatment assignment is random (e.g., a coin flip), ensuring \(z\) is independent of \(y_0\) and \(y_1\).
    • Observational Study: Treatment assignment may depend on patient characteristics, potentially leading to biased estimates.
  2. Age and Treatment Example:
    • Scenario: Older patients might forego treatment believing it’s less beneficial, while younger patients might opt for treatment believing it’s more beneficial.
    • Consequence: A naive comparison between treated and untreated groups might overestimate the treatment effect due to confounding by age (simulated in the sketch after this list).
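
The age scenario can be simulated directly. In this hypothetical sketch, the true effect is +5 for everyone, but because younger patients (who have better untreated outcomes) opt in more often, the naive difference in means comes out larger than 5:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical observational setting: age affects both the outcome
# and the decision to take the treatment, so age is a confounder.
age = rng.uniform(20, 80, size=n)
y0 = 50.0 - 0.3 * age + rng.normal(0, 5, size=n)  # outcome worsens with age
y1 = y0 + 5.0                                     # true effect is +5 for everyone

# Younger patients opt in more often: z depends on age, hence on (y0, y1).
p_treat = np.clip(1.2 - age / 100.0, 0.05, 0.95)
z = rng.binomial(1, p_treat)
y_obs = z * y1 + (1 - z) * y0

naive = y_obs[z == 1].mean() - y_obs[z == 0].mean()
print(naive)  # noticeably larger than 5: age confounding inflates the estimate
```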

Ignorability Condition

  1. Condition: \(y_0\) and \(y_1\) are independent of \(z\) given covariates \(x\) (e.g., age).
    • Stratified Analysis: In both randomized experiments and observational studies, stratifying on covariates like age can help achieve conditional unconfoundedness.
  2. Adjusting for Covariates:
    • Randomized Experiment: Can stratify on covariates either before or after the experiment.
    • Observational Study: Treat it as a stratified randomized experiment by conditioning on covariates related to treatment status and potential outcomes (see the stratified sketch after this list).
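
Continuing the same hypothetical confounded scenario, the sketch below stratifies on age decade and averages the within-stratum differences in means, weighted by stratum size. If ignorability holds given age, each within-stratum comparison is approximately unconfounded, and the weighted average moves back toward the true effect:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Same hypothetical setup: age drives treatment uptake and the untreated
# outcome; the true ATE is 5.
age = rng.uniform(20, 80, size=n)
y0 = 50.0 - 0.3 * age + rng.normal(0, 5, size=n)
y1 = y0 + 5.0
z = rng.binomial(1, np.clip(1.2 - age / 100.0, 0.05, 0.95))
y = z * y1 + (1 - z) * y0

# Stratify on age decade; within a stratum, assignment is nearly
# unconfounded, so the within-stratum difference in means is nearly
# unbiased. Weight each stratum by its share of the sample.
strata = (age // 10).astype(int)
ate_hat, total = 0.0, 0
for s in np.unique(strata):
    in_s = strata == s
    treated, control = y[in_s & (z == 1)], y[in_s & (z == 0)]
    if treated.size and control.size:
        ate_hat += (treated.mean() - control.mean()) * in_s.sum()
        total += in_s.sum()

print(ate_hat / total)  # close to 5, unlike the inflated naive comparison
```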

Practical Implications

  1. Comparison of Groups:
    • In a stratified randomized experiment, compare treated and control groups within each stratum (e.g., age group) to estimate ATE.
    • In observational studies, stratify on covariates to reduce bias and estimate ATE as if it were a stratified randomized experiment.
  2. Challenges in Observational Studies:
    • Unknown Assignment Mechanism: Unlike randomized experiments, the assignment mechanism in observational studies is not controlled, making it harder to ensure unconfoundedness.
    • Measurement of Confounders: It’s crucial to measure and account for all relevant confounders, though it may not always be possible.

Dare to Lead

I had never looked so deeply into my own feelings until I met this book by Brené Brown. I fell in love with it immediately when I saw the quote from Theodore Roosevelt at the very beginning:

It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again… who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly.

Part 1. Rumbling with Vulnerability

Definition of the courage to be vulnerable:

The courage to be vulnerable is not about winning or losing, it’s about the courage to show up when you can’t predict or control the outcome. The only thing I know for sure after all of this research is that if you’re going to dare greatly, you’re going to get your ass kicked at some point. If you choose courage, you will absolutely know failure, disappointment, setback, even heartbreak. That’s why we call it courage. That’s why it’s so rare.

Definition of rumbling with vulnerability. It’s a foundational and core skill of courage building, and “our ability to be daring leaders will never be greater than our capacity for vulnerability”.

A rumble is a discussion, conversation, or meeting defined by a commitment to lean into vulnerability, to stay curious and generous, to stick with the messy middle of problem identification and solving, to take a break and circle back when necessary, to be fearless in owning our parts, and, as psychologist Harriet Lerner teaches, to listen with the same passion with which we want to be heard.

Section 1. The Moment and The Myths

Good practices:

  1. Have the courage to show up when you can’t control the outcome

The definition of vulnerability is the emotion that we experience during times of uncertainty, risk, and emotional exposure. Vulnerability is not winning or losing. It’s having the courage to show up when you can’t control the outcome.

  2. Step over cheap-seat feedback and keep daring

If you are not in the arena getting your ass kicked on occasion, I’m not interested in or open to your feedback. There are a million cheap seats in the world today filled with people who will never be brave with their lives but who will spend every ounce of energy they have hurling advice and judgment at those who dare greatly. Their only contributions are criticism, cynicism, and fearmongering. If you’re criticizing from a place where you’re not also putting yourself on the line, I’m not interested in what you have to say.

Don’t grab hurtful comments and pull them close to you by rereading them and ruminating on them. Don’t play with them by rehearsing your badass comeback. And whatever you do, don’t pull hatefulness close to your heart.

  3. Don’t shield ourselves from all feedback

Again, if we shield ourselves from all feedback, we stop growing. If we engage with all feedback, regardless of the quality and intention, it hurts too much, and we will ultimately armor up by pretending it doesn’t hurt, or, worse yet, we’ll disconnect from vulnerability and emotion so fully that we stop feeling hurt. When we get to the place that the armor is so thick that we no longer feel anything, we experience a real death. We’ve paid for self-protection by sealing off our heart from everyone and everything, not just hurt, but love.

The six misguided myths of vulnerability:

  1. Vulnerability is weakness
  2. I don’t do vulnerability

Choosing to own our vulnerability and do it consciously means learning how to rumble with this emotion and understand how it drives our thinking and behavior so we can stay aligned with our values and live in our integrity. Pretending that we don’t do vulnerability means letting fear drive our thinking and behavior without our input or even awareness, which almost always leads to acting out or shutting down.

  3. I can do it alone
  4. You can engineer uncertainty out of vulnerability
  5. Trust comes before vulnerability

We need to trust to be vulnerable, and we need to be vulnerable in order to build trust.

Trust is the stacking and layering of small moments and reciprocal vulnerability over time. Trust and vulnerability grow together, and to betray one is to destroy both.

And I like this marble jar approach:

We trust the people who have earned marbles over time in our life. Whenever someone supports you, or is kind to you, or sticks up for you, or honors what you share with them as private, you put marbles in the jar. When people are mean, or disrespectful, or share your secrets, marbles come out. We look for the people who, over time, put marbles in, and in, and in, until you look up one day and they’re holding a full jar. Those are the folks you can tell your secrets to. Those are the folks you trust with information that’s important to you.

  6. Vulnerability is disclosure

Section 2. The Call to Courage

Leaders must either invest a reasonable amount of time attending to fears and feelings, or squander an unreasonable amount of time trying to manage ineffective and unproductive behavior.

  1. Hunt treasures

This is when I remember Joseph Campbell’s quote, which I believe is one of the purest calls to courage for leaders: “The cave you fear to enter holds the treasure you seek.”

When we are in fear or in self-protection, these are the patterns of how we assemble our armor. And they will NOT lead us anywhere.

  1. I’m not enough

  2. If I’m honest about what’s happening, they will think less of me or maybe use it against me

  3. No one else is going to be honest about what’s happening and so no way am I going to do that

  4. They are not honest about what scares them and they’ve got a lot of issues

  5. This is their fault and they are trying to blame me

  6. I’m better than them

  2. Serve people

When you find the courage to enter that cave, you’re never going in to secure your own treasure or your own wealth; you face your fears to find the power and wisdom to serve others.

Good practices:

  1. Have a one-on-one discussion
  2. Stop talking. Leave long white pauses and empty space so that we can start peeling and going deep.
  3. When they start talking, really listen.
  4. When we are in tough rumbles with people, we can’t take responsibility for their emotions.
  5. When rumbles become unproductive, give everyone a few minutes to walk around outside or catch their breath.

Section 3. The Armory

The problem is that when we imprison the heart, we kill courage. In the same way that we depend on our physical heart to pump life-giving blood to every part of our body, we depend on our emotional heart to keep vulnerability coursing through the veins of courage and to engage all of the behaviors we talked about in the prior section, including trust, innovation, creativity, and accountability.

You’ve got to put down the weapons and show up.

As children we found ways to protect ourselves from vulnerability, from being hurt, diminished, and disappointed. We put on armor; we used our thoughts, emotions, and behaviors as weapons; and we learned how to make ourselves scarce, even to disappear. Now as adults we realize that to live with courage, purpose, and connection, to be the person we long to be, we must again be vulnerable. We must take off the armor, put down the weapons, show up, and let ourselves be seen.

Forms of armored leadership with top three as perfectionism, foreboding joy, numbing:

  1. Driving Perfectionism (armored leadership) vs encouraging for healthy striving (daring leadership).

Perfectionism is not the same thing as striving for excellence. Perfectionism is not about healthy achievement and growth. Perfectionism is a defensive move.

Perfectionism is not the self-protection we think it is. It is a twenty-ton shield that we lug around, thinking it will protect us, when in fact it’s the thing that’s really preventing us from being seen.

Perfectionism is not self-improvement. Perfectionism is, at its core, about trying to earn approval. Most perfectionists grew up being praised for achievement and performance (grades, manners, rule following, people pleasing, appearance, sports). Somewhere along the way, they adopted this dangerous and debilitating belief system: I am what I accomplish and how well I accomplish it. Please. Perform. Perfect. Prove. Healthy striving is self-focused: How can I improve? Perfectionism is other-focused: What will people think? Perfectionism is a hustle.

Perfectionism is not the key to success. In fact, research shows that perfectionism hampers achievement. Perfectionism is correlated with depression, anxiety, addiction, and life paralysis, or missed opportunities. The fear of failing, making mistakes, not meeting people’s expectations, and being criticized keeps us outside the arena where healthy competition and striving unfolds.

Last, perfectionism is not a way to avoid shame. Perfectionism is a function of shame.

  2. Squandering opportunities for joy and recognition (armored leadership) vs practicing gratitude and celebrating milestones (daring leadership).

  3. Numbing (armored leadership) vs setting boundaries and finding real comfort (daring leadership)

We cannot selectively numb emotion. If we numb the dark, we numb the light. If we take the edge off pain and discomfort, we are, by default, taking the edge off joy, love, belonging, and the other emotions that give meaning to our lives.

  4. Propagating the false dichotomy of victim or Viking (armored leadership) vs practicing integration - strong back, soft front, wild heart (daring leadership)

The opposite of living in a world of false binaries is practicing integration: the act of bringing together all the parts of ourselves, as we talked about earlier. We are all tough and tender, scared and brave, grace and grit. The most powerful example of integration, a practice that I wrote about in Braving the Wilderness and that I try to live by, is strong back, soft front, wild heart.

How can we give and accept care with strong-back, soft-front compassion, moving past fear into a place of genuine tenderness? I believe it comes about when we can be truly transparent, seeing the world clearly, and letting the world see into us.

  5. Being a knower and being right (armored leadership) vs being a learner and getting it right (daring leadership)

Having to be the “knower” or always being right is heavy armor. It’s defensiveness, it’s posturing, and, worst of all, it’s a huge driver of bullshit.

  6. Hiding behind cynicism (armored leadership) vs modeling clarity, kindness, and hope (daring leadership)
  7. Using criticism as self-protection (armored leadership) vs making contributions and taking risks (daring leadership)
  8. Using power over (armored leadership) vs using power with, power to, and power within (daring leadership)
  9. Hustling for your worth (armored leadership) vs knowing your value (daring leadership)

When people don’t understand where they’re strong and where they deliver value for the organization or even for a single effort, they hustle. And not the good kind of hustle. The kind that’s hard to be around because we are jumping in everywhere, including where we’re not strong or not needed, to prove we deserve a seat at the table. When we do not understand our value, we often exaggerate our importance in ways that are not helpful, and we consciously or unconsciously seek attention and validation of importance.

  10. Zigzagging and avoiding (armored leadership) vs talking straight and taking action (daring leadership)

Zigzagging is a metaphor for the energy we spend trying to dodge the bullets of vulnerability, whether it’s conflict, discomfort, confrontation, or the potential for shame, hurt, or criticism.

When we find ourselves zigzagging (hiding out, pretending, avoiding, procrastinating, rationalizing, blaming, lying), we need to remind ourselves that running is a huge energy suck and probably way outside our values. At some point, we have to turn toward vulnerability and make that call.

Section 4. Shame and Empathy

The definition of shame:

First, shame is the fear of disconnection. As we talked about in the myths of vulnerability, we are physically, emotionally, cognitively, and spiritually hardwired for connection, love, and belonging. Connection, along with love and belonging, is why we are here, and it is what gives purpose and meaning to our lives. Shame is the fear of disconnection: it’s the fear that something we’ve done or failed to do, an ideal that we’ve not lived up to, or a goal that we’ve not accomplished makes us unworthy of connection.

Shame is the intensely painful feeling or experience of believing that we are flawed and therefore unworthy of love, belonging, and connection.

Retreating into our smallness becomes the most seductive and easiest way to stay safe in the midst of the shame squeeze. But, as we’ve talked about, when we armor and contort ourselves into smallness, things break and we suffocate.

Behavioral cues that shame has permeated a culture:

Perfectionism; Favoritism; Gossiping; Back-channeling; Comparison; Self-worth tied to productivity; Harassment; Discrimination; Power over; Bullying; Blaming; Teasing; Cover-ups.

Shame resistance is not possible as long as we care about connection, but shame resilience is possible and learnable by all of us. We need empathy and self-compassion.

Shame resilience is the ability to practice authenticity when we experience shame, to move through the experience without sacrificing our values, and to come out on the other side of the shame experience with more courage, compassion, and connection than we had going into it. Ultimately, shame resilience is about moving from shame to empathy, the real antidote to shame.

It’s important to understand that if we share our story with someone who responds with empathy and understanding, shame can’t survive. Self-compassion is also critically important, but because shame is a social concept (it happens between people), it also heals best between people. A social wound needs a social balm, and empathy is that balm. Self-compassion is key because when we’re able to be gentle with ourselves in the midst of shame, we’re more likely to reach out, connect, and experience empathy.

The definition of empathy:

Empathy is not connecting to an experience, it’s connecting to the emotions that underpin an experience.

Empathy is a choice. And it’s a vulnerable choice, because if I were to choose to connect with you through empathy, I would have to connect with something in myself that knows that feeling. In the face of a difficult conversation, when we see that someone’s hurt or in pain, it’s our instinct as human beings to try to make things better. We want to fix, we want to give advice. But empathy isn’t about fixing; it’s the brave choice to be with someone in their darkness, not to race to turn on the light so we feel better.

If struggle is being down in a hole, empathy is not jumping into the hole with someone who is struggling and taking on their emotions, or owning their struggle as yours to fix. If their issues become yours, now you have two people stuck in a hole. Not helpful. Boundaries are important here. We have to know where we end and others begin if we really want to show up with empathy.

Empathy is at the heart of connection: it is the circuit board for leaning into the feelings of others, reflecting back a shared experience of the world, and reminding them that they are not alone.

Empathy skills:

From a practical perspective, empathy is, first, taking the perspective of another person; second, staying out of judgment; third, understanding their emotion; and fourth, communicating your understanding of that emotion.

  1. To see the world as others see it, or perspective taking

Perspective taking requires becoming the learner, not the knower.

Again, it’s only when diverse perspectives are included, respected, and valued that we can start to get a full picture of the world, who we serve, what they need, and how to successfully meet people where they are.

I love what Beyoncé said in her first-person essay in the September 2018 issue of Vogue: “If people in powerful positions continue to hire and cast only people who look like them, sound like them, come from the same neighborhoods they grew up in, they will never have a greater understanding of experiences different from their own. They will hire the same models, curate the same art, cast the same actors over and over again, and we will all lose. The beauty of social media is it’s completely democratic. Everyone has a say. Everyone’s voice counts, and everyone has a chance to paint the world from their own perspective.”

  2. To be unjudgmental

Based on research, there are two ways to predict when we are going to judge: We judge in areas where we’re most susceptible to shame, and we judge people who are doing worse than we are in those areas.

  3. To understand another person’s feelings
  4. To communicate your understanding of that person’s feelings

Fluency in emotional conversation means being able to name at least thirty emotions.

One reason emotion is difficult to identify and name is the iceberg effect.

Many of the emotions that we experience show up as pissed off or shut down on the surface. Below the surface, there’s much more nuance and depth. Shame and grief are two examples of emotions that are hard to fully express, so we turn to anger or silence.

The vast majority of us find it easier to be mad than hurt. Not only is it easier to express anger than it is to express pain, our culture is more accepting of anger. So the next time you’re shutting down or angry, ask yourself what lies beneath.

  5. Mindfulness / Paying attention

Self-compassion skills

  1. Maintain a clear line

Do not take responsibility and ownership for the words of other people; just own your part.

Jumping into the hole with no way out is enmeshment; jumping into struggle with someone while maintaining clear lines about what belongs to whom is empathy.

  2. Stop beating yourself up

Talk to yourself the way you’d talk to someone you love.

Four elements of shame resilience:

  1. Recognizing shame and understanding its triggers

When we have understanding and awareness around shame, we are less likely to default to our shame shields or the following three strategies of disconnection:

  • Moving away: Withdrawing, hiding, silencing ourselves, and keeping secrets.
  • Moving toward: Seeking to appease and please.
  • Moving against: Trying to gain power over others by being aggressive, and by using shame to fight shame.

  2. Practicing critical awareness
  3. Reaching out
  4. Speaking shame

Section 5. Curiosity and Grounded Confidence

Dheeraj explained to me that when leaders don’t have the skills to lean into vulnerability, they’re not able to successfully hold the tension of the paradoxes that are inherent in entrepreneurship. His examples of the paradoxes that elicit vulnerability in leaders align with what we heard from the research participants:

  • Optimism and paranoia
  • Letting chaos reign (the act of building) and reining in chaos (the act of scaling)
  • Big heart and tough decision making
  • Humility and fierce resolve
  • Velocity and quality when building new things
  • Left brain and right brain
  • Simplicity and choice
  • Thinking global, acting local
  • Ambition and attention to detail
  • Thinking big but starting small
  • Short-term and long-term
  • Marathons and sprints, or a marathon of sprints in business-building

Dheeraj told me, “Leaders must learn the skills to hold these tensions and get adept at balancing on the ‘tightrope’ of life. Ultimately, leadership is the ability to thrive in the ambiguity of paradoxes and opposites.”

How to build skills to hold tensions of the paradoxes:

  1. Rumble skills: easy learning does not build strong skills

The reality is that to be effective, learning needs to be effortful. That’s not to say that anything that makes learning easier is counterproductive-or that all unpleasant learning is effective. The key here is desirable difficulty. The same way you feel a muscle “burn” when it’s being strengthened, the brain needs to feel some discomfort when it’s learning. Your mind might hurt for a while-but that’s a good thing.

  2. Curiosity

In his book Curious: The Desire to Know and Why Your Future Depends on It, Ian Leslie writes, “Curiosity is unruly. It doesn’t like rules, or, at least, it assumes that all rules are provisional, subject to the laceration of a smart question nobody has yet thought to ask. It disdains the approved pathways, preferring diversions, unplanned excursions, impulsive left turns. In short, curiosity is deviant.”

  3. Practice vulnerability, become self-aware, and engage in tough conversations

There’s an old saying that I live by now: “People don’t care how much you know until they know how much you care.” I’ve learned one way to help people understand how much you care is to share your story.

Part 2. Living into Our Values

Values and living into our values:

A value is a way of being or believing that we hold most important. Living into our values means that we do more than profess our values, we practice them. We walk our talk-we are clear about what we believe and hold important, and we take care that our intentions, words, thoughts, and behaviors align with those beliefs.

More often than not, our values are what lead us to the arena door-we’re willing to do something uncomfortable and daring because of our beliefs. And when we get in there and stumble or fall, we need our values to remind us why we went in, especially when we are facedown, covered in dust and sweat and blood. Here’s the thing about values: While courage requires checking our armor and weapons at the arena door, we do not have to enter every tough conversation and difficult rumble completely empty-handed.

Three steps to help you know more about yourself and how to live into your values:

  1. We can’t live into values that we can’t name
  2. Taking values from BS to behavior
  3. Empathy and self-compassion: the two most important seats in the arena

Regardless of the values you pick, daring leaders who live into their values are never silent about hard things.

“You first listen about race. You will make a lot of mistakes. It will be super uncomfortable. And there’s no way to talk about it without getting some criticism. But you can’t be silent.” To opt out of conversations about privilege and oppression because they make you uncomfortable is the epitome of privilege.

Silence is not brave leadership, and silence is not a component of brave cultures. Showing up and being courageous around these difficult conversations is not a path you can predetermine. A brave leader is not someone who is armed with all the answers. A brave leader is not someone who can facilitate a flawless discussion on hard topics. A brave leader is someone who says I see you. I hear you. I don’t have all the answers, but I’m going to keep listening and asking questions. We all have the capacity to do that. We all have the ability to foster empathy. If we want to do good work, it’s imperative that we continue to flesh out these harder conversations, to push against secrecy, silence, and judgment. It’s the only way to eradicate shame from the workplace, to clear the way for a performance in the arena that correlates with our highest values and not the fearmongers from the stands.

The biggest challenge we face when it comes to values is the necessity of giving and receiving feedback. You have to know when you are ready to give feedback, and you have to be good at receiving it.

  1. Understand their values

You don’t really know people until you take the time to understand their values.

  2. Daring leaders assume the best about people’s intentions and assume they are doing the best they can. Leaders struggling with ego, armor, and/or a lack of skills do not make that assumption.

What is the foundational skill of assuming the best in people? Setting and maintaining boundaries. What’s the fundamental belief underpinning the assumption of positive intent? That people are doing the best they can.

The people who are the most generous in their assumptions of others have the clearest boundaries. The most compassionate and generous people I’ve interviewed in my career are the most boundaried. It turns out that we assume the worst about people’s intentions when they’re not respectful of our boundaries: it is easy to believe that they are trying to disappoint us on purpose. However, we can be very compassionate toward people who acknowledge and respect what’s okay and what’s not.

In addition to boundaries, an assumption of positive intent relies on the core belief that people are doing the best they can with what they’ve got, versus that people are lazy, disengaged, and maybe even trying to piss us off on purpose. Sure, we’re all capable of change and growth, but assuming positive intent requires the belief that people are really trying in that moment.

Assuming positive intent does not mean that we stop helping people set goals or that we stop expecting people to grow and change. It’s a commitment to stop respecting and evaluating people based solely on what we think they should accomplish, and start respecting them for who they are and holding them accountable for what they’re actually doing. And when we’re overwhelmed and struggling, it also means turning those positive assumptions toward ourselves: I’m doing the very best I can right now.

Part 3. Braving Trust

Importance of talking about trust:

Because talking about trust is tough, and because these conversations have the potential to go sideways fast, we often avoid the rumble. And that’s even more dangerous. First, when we’re struggling with trust and don’t have the tools or skills to talk about it directly with the person involved, it leads us to talk about people instead of to them. It also leads to lots of energy-wasting zigzagging.

To assess an individual’s trustworthiness, you can refer to the following seven behaviors, known as the BRAVING inventory:

  • Boundaries: You respect my boundaries, and when you’re not clear about what’s okay and not okay, you ask. You’re willing to say no.
  • Reliability: You do what you say you’ll do. At work, this means staying aware of your competencies and limitations so you don’t overpromise and are able to deliver on commitments and balance competing priorities.
  • Accountability: You own your mistakes, apologize, and make amends.
  • Vault: You don’t share information or experiences that are not yours to share. I need to know that my confidences are kept, and that you’re not sharing with me any information about other people that should be confidential.
  • Integrity: You choose courage over comfort. You choose what is right over what is fun, fast, or easy. And you choose to practice your values rather than simply professing them.
  • Nonjudgment: I can ask for what I need, and you can ask for what you need. We can talk about how we feel without judgment. We can ask each other for help without judgment.
  • Generosity: You extend the most generous interpretation possible to the intentions, words, and actions of others.

Unpacking Vault:

When I walk into a co-worker’s office and spill, there might be a moment of connection, but it’s counterfeit connection. The second I walk out, that colleague is likely thinking, “I should be careful about what I tell Brené; she’s got no boundaries.”

Unpacking nonjudgment:

We are afraid of being judged for a lack of knowledge or lack of understanding, so we hate asking questions.

We asked a thousand leaders to list marble-earning behaviors: what do your team members do that earns your trust? The most common answer: asking for help. When it comes to people who do not habitually ask for help, the leaders we polled explained that they would not delegate important work to them because the leaders did not trust that they would raise their hands and ask for help.

Trust is built in small moments. If you struggle with reliability, make small and doable promises to yourself that are easy to fulfill, until you get a flywheel of reliability going again. If you struggle with boundaries, set small ones with your partner, like not being responsible for both cooking and cleaning up dinner, until you are adept at putting boundaries into action in a more meaningful way. That’s how you fill your own marble jar. And never forget: we can’t give people what we don’t have.

Part 4. Learning to Rise

We can’t expect people to be brave and risk failure if they’re not prepped for hard landings.

Here’s the bottom line: If we don’t have the skills to get back up, we may not risk falling. And if we’re brave enough often enough, we are definitely going to fall. The research participants who have the highest levels of resilience can get back up after a disappointment or a fall, and they are more courageous and tenacious as a result of it. They do that with a process that I call Learning to Rise. It has three parts: the reckoning, the rumble, and the revolution.

Three-step process for learning to rise:

When we have the courage to walk into our story and own it, we get to write the ending. And when we don’t own our stories of failure, setbacks, and hurt, they own us.

  1. The Reckoning

The reckoning is as simple as that: knowing that we’re emotionally hooked and then getting curious about it.

The ego doesn’t own stories or want to write new endings; it denies emotion and hates curiosity. Instead, the ego uses stories as armor and alibi. The ego says “Feelings are for losers and weaklings.”

The most effective strategy for staying with emotion instead of offloading it is something I learned from a yoga teacher. And from a few members of the military Special Forces. It’s breathing.

Breathing is also the key to another strategy for reckoning with emotion, and one of the most underrated leadership superpowers: practicing calm.

I define calm as creating perspective and mindfulness while managing emotional reactivity.

Calm is a superpower because it is the balm that heals one of the most prevalent workplace stressors: anxiety.

  2. Rumble: conspiracies, confabulations, and shitty first drafts (SFDs)

If the reckoning is how we walk into a tough story, the rumble is where we go to the mat with it and own it.

The rumble starts with this universal truth: In the absence of data, we will always make up stories. It’s how we are wired. Meaning making is in our biology, and when we’re in struggle, our default is often to come up with a story that makes sense of what’s happening and gives our brain information on how best to self-protect. And it happens a hundred times a day at work.

In our SFDs, fear fills in the data gaps. What makes that scary is that stories based on limited real data and plentiful imagined data, blended into a coherent, emotionally satisfying version of reality, are called conspiracy theories. Yes, we are all conspiracy theorists with our own stories, constantly filling in data gaps with our fears and insecurities.

Confabulation has a really great and subtle definition: A confabulation is a lie told honestly. To confabulate is to replace missing information with something false that we believe to be true.

Confabulation shows up at work when we share what we believe is factual information, but it’s really just our opinion.

Gottschall writes, “Conspiracy is not limited to the stupid, the ignorant, or the crazy. It is a reflex of the storytelling mind’s compulsive need for meaningful experience.” The problem is that rather than rumbling with vulnerability and staying in uncertainty, we start to fill in the blanks with our fears and worst-case-scenario planning. I love this line from Gottschall: “To the conspiratorial mind, shit never just happens.”

The three most dangerous stories we make up are the narratives that diminish our lovability, divinity, and creativity.

The reality check around our lovability: Just because someone isn’t willing or able to love us, it doesn’t mean that we are unlovable.

The reality check around our divinity: No person is ordained to judge our divinity or to write the story of our spiritual worthiness.

The reality check around our creativity: Just because we didn’t measure up to some standard of achievement doesn’t mean that we don’t possess gifts and talents that only we can bring to the world. And just because someone failed to see the value in what we can create or achieve doesn’t change its worth or ours.

When we own a story and the emotion that fuels it, we get to simultaneously acknowledge that something was hard while taking control of how that hard thing is going to end. We change the narrative. When we deny a story and when we pretend we don’t make up stories, the story owns us. It drives our behavior, and it drives our cognition, and then it drives even more emotions until it completely owns us.

  3. The Revolution

I’m not afraid of the word revolution, I’m afraid of a world that’s becoming less courageous and authentic. I’ve always believed that in a world full of critics, cynics, and fearmongers, taking off the armor and rumbling with vulnerability, living into our values, braving trust with open hearts, and learning to rise so we can reclaim authorship of our own stories and lives is the revolution. Courage is rebellion.

Revolution might sound a little dramatic, but in this world, choosing authenticity and worthiness is an absolute act of resistance. Choosing to live and love with our whole hearts is an act of defiance. You’re going to confuse, piss off, and terrify lots of people, including yourself. One minute you’ll pray that the transformation stops, and the next minute you’ll pray that it never ends. You’ll also wonder how you can feel so brave and so afraid at the same time. At least that’s how I feel most of the time … brave, afraid, and very, very alive.

Own the fear, find the cave, and write a new ending for yourself, for the people you’re meant to serve and support, and for your culture. Choose courage over comfort. Choose whole hearts over armor. And choose the great adventure of being brave and afraid. At the exact same time.

Substack

Playing defense: How to control the narrative if your work is being questioned - Wes Kao’s Newsletter [Link]

It’s normal that people will misunderstand and disagree with you. What we need to do is 1) learn to explain our ideas better and 2) stay calm and share our thought process in the most objective way possible.

Defending your thinking means sharing the logic, evidence, and rationale that explain why you believe your conclusion is the right one. It does not mean protecting your ego by refusing to acknowledge a good argument or being delusional about the strength of your claim. Being able to play defense is important because it’s about your credibility: if you do it well, you build trust; otherwise, you diminish it.

Some suggestions mentioned in this article:

  • Have a rationale for every small decision you make.
  • Try to anticipate questions.
  • Embrace “show, don’t tell”
  • React as positively as possible. e.g. “Ah! I’m so glad you asked”.
  • Consider the question behind the question.
  • Be happy that the person voiced their concern.
  • Beware of insecure vibes.

If you overcompensate, you’ll come across as defensive. This decreases your credibility too. You’ll need to use your judgment and read the situation. An open, curious, and almost playful attitude shows you’re not afraid of hard questions.

Many people underestimate the daily moments where your credibility can either be reinforced or eroded. This might sound dramatic, but it’s quite banal: Every interaction folks have with you gets added to their subconscious cumulative repository of data points about you.

Insecure vibes are subconscious clues and signals that you might be giving off when you’re feeling anxious, nervous, or uncertain. Get rid of insecure vibes—and your writing, meetings, presentations, negotiations, and pitches will become stronger.

― “Insecure vibes” are a self-fulfilling prophecy - Wes Kao’s Newsletter [Link]

In the following situations, insecure vibes happen:

  • When the other person touches on a sore spot and you feel threatened

  • You assume the person will say no before you even start

    This can make you talk fast, showing that you enter the conversation already playing defense. You don’t give the person a chance to say yes because you’ve already said no to yourself.

  • You insist on email or Slack when you know a phone call is better

    Explaining your point in writing is a sign of lacking confidence and avoiding confrontation.

  • You over-explain because you expect the other person to be skeptical

    You bring up counterpoints to arguments no one has mentioned, which is not the time to do so. It looks like you are intentionally bringing up something new to surprise others rather than offering a reasonable justification for your point.

To avoid sounding doubtful when you actually feel confident:

  • Don’t preface your idea with too many caveats. Speak in complete sentences. Remove “ands” and “buts” that create never-ending sentences, which can sound less authoritative.
  • Notice if you start to ramble. Try to prepare the first few lines of what you want to say to kick off a meeting, so you start strong.
  • Practice your actual script so you get comfortable saying those words.

To get rid of insecure vibes, ask yourself:

  • Could this be interpreted as sounding defensive?
  • Am I overcompensating or overexplaining?
  • How would I respond on my best day?
  • Would I say this if I felt secure?

Strategy, not self-expression: How to decide what to say when giving feedback - Wes Kao’s Newsletter [Link]

Ask yourself “Is this strategy or self-expression?” before giving feedback.

  • Do not self-express your feelings or complain. Do not say anything that does not motivate them to change. Instead, say things that get you closer to changing the person’s behavior.

How to focus on strategy rather than self-expression:

  • Mentally forgive this person

  • Identify what is most likely to motivate them to change.

  • Say only the 10% that will actually change behavior, thinking about:

    • How does this make them even more effective?
    • How will this allow them to work even better with the people around them?
    • How does this get them closer to their goals?
    • How is this a skill they can apply now and in all future roles?
  • Don’t trigger the defensiveness in the first place

    The minute your recipient gets defensive, it becomes a lot harder to undo the defensiveness and get them to accept what you’re saying.

  • Let the other person talk, e.g. “I’d love to hear what you think. What parts are resonating most with you?”

    The other benefit of letting the other person talk is cognitive dissonance: if they say out loud what they are committed to doing differently, they are reinforcing the idea in their own mind.

  • Keep your eyes on the person’s behavior change

  • Always be framing

    “Strategy, not self-expression” applies to many more situations too.

  1. The more controversial the idea, the higher the burden of proof.
  2. Update your assumptions about how you add value.
  3. Share where your hunch is coming from—because it’s coming from somewhere.
  4. Describe why the problem matters, so people understand why you’re speaking up.
  5. Don’t rely on your credentials. Your idea should make sense on its own.
  6. Use language that accurately reflects your level of certainty.

― How to share your point of view (even if you’re afraid of being wrong) - Wes Kao’s Newsletter [Link]

Every week, we make business cases at work. I’m defining a business case as any recommendation to pursue a business opportunity or solve a problem. A business case can be a 5-page document, 5 sentences in Slack, or a 5-minute phone call. The larger the project, the more you may need to make a comprehensive business case. But the underlying premise is the same: If you don’t explain why a problem matters, your colleagues won’t have the necessary information to decide how to support you.

― The #1 question every business case should answer - Wes Kao’s Newsletter [Link]

Skilled immigration is a national security priority - Noahpinion [Link]

Skilled immigration should be supported while illegal immigration should be curbed. The US education system should also be prevented from becoming a corrupt channel for immigration.

The low road, the high road, and the way the wind blows - Silver Bulletin [Link]

Paramount Merges With Skydance - App Economy Insights [Link]

PARA agreed to merge with Skydance. The new CEO is Skydance Media CEO David Ellison whose father Larry Ellison is the founder of Oracle.

Old Paramount businesses: 1) filmed entertainment (Paramount Pictures and Nickelodeon movies), 2) TV media (Paramount’s broadcast and cable television networks, like CBS and MTV), 3) direct-to-consumer streaming services like Paramount+ and Pluto TV. Its current problems: 1) flat revenue, 2) money-losing streaming services, 3) high long-term debt but low cash.

New Paramount Plan: 1) unify marquee rights, 2) reorganize finance, 3) transition into a tech-media hybrid.

For #3, the vision is to better position Paramount on the front end (DTC apps) and the back end (cloud infrastructure, cloud-based production, and AI tools). Specifically, they are going to 1) rebuild DTC into a differentiated platform, 2) build a studio-in-the-cloud, and 3) leverage GenAI.

Fintech Shake-Up - App Economy Insights [Link]

Apple unveiled a new peer-to-peer (P2P) Apple Pay feature called “Tap to Cash,” a natural evolution of the existing “Tap to Pay.” This is not the only case where Big Tech offers features that directly compete with financial institutions: Google Wallet, Amazon’s lending program for sellers, and others compete with PayPal’s Venmo, Block’s Cash App, and similar services. Big Tech’s moves not only intensify competition within P2P payments but also raise questions about the future of the payment industry.

Highlights:

  1. Visa and Mastercard both face mobile-wallet threats to their card-based business models.
  2. American Express, renowned for its premium card offerings targeting high-spending, high-credit-quality customers, continues to attract millennial and Gen Z customers. Its recent strategic acquisitions are Tock (a reservation platform for high-end restaurants and events) and Rooam (a mobile payment and ordering platform for restaurants and venues). However, it is sensitive to economic downturns and faces competition from other premium card issuers and digital payment platforms.
  3. Fiserv’s merchant solutions and financial solutions look positive; it is actively investing in digital transformation, and it recently acquired BentoBox (a digital marketing and commerce platform for restaurants).
  4. Adyen serves large enterprise clients with its unified commerce platform.
  5. PayPal lowered its FY 2024 guidance and is facing a decline in active accounts. To address this, it is focusing on strategic partnerships, such as a collaboration with Apple. It faces competitive pressure from Big Tech payment solutions like Apple Pay and Google Pay, is undergoing significant restructuring (layoffs and leadership changes), and is launching an AI-powered personalized ads platform while strengthening relationships with SME customers.
  6. Block’s growth is driven by the momentum of its Cash App ecosystem, its Square ecosystem, and significant investment in Bitcoin. However, it faces regulatory scrutiny of its cryptocurrency activities and compliance practices, competition from established players and fintech startups, and the need to balance growth and profitability given its FY 2024 guidance.

What to watch for the shift in payment landscape:

  • Facing legal battles with Big Tech, can Visa and Mastercard innovate to maintain their revenue streams from swipe fees?
  • Will digital wallets dominate the payment industry? Will traditional cards remain or be replaced?
  • New payment options such as Buy Now, Pay Later (BNPL) have been developed. Will these innovations gain mainstream adoption?
  • Can the challenges facing cryptocurrencies be overcome? Will cryptocurrencies achieve mainstream adoption?
  • As consumers demand seamless, secure, cheaper, and personalized payment experiences, the companies that can provide them will succeed.

Nike: Losing Its Swoosh? - App Economy Insights [Link]

Nike is facing challenges from 1) shifting consumer preferences toward newer brands (On and Hoka), 2) softer traffic and lower sales of classic footwear franchises in its direct-to-consumer channel, and 3) macroeconomic headwinds.

Nike Q4 2024 Highlights: 1) Nike is reducing supply of classic footwear franchises to create space for newness and innovation, 2) focusing on performance and innovation, 3) Jordan Brand is still growing YoY.

Other observation: Nike’s brand value declined by 4% YoY, while Adidas and Lululemon gained brand value.

Broadcom: AI Surge - App Economy Insights [Link]

Broadcom now operates across two primary segments: 1) semiconductor solutions (chips for networking, server storage, broadband, wireless communication, and industrial applications), and 2) infrastructure software (an explosive leap with the acquisition of VMware).

Highlights: VMware acquisition brings significant revenue to Broadcom; AI as a great growth driver; jumbo acquisition resulted in gigantic net debt; strong cash generation; 10-for-1 stock split on July 15.

Future: Next-generation products include Tomahawk and Jericho; risks include supply chain disruption and inflation driven by the macroeconomic environment, and regulatory scrutiny of the VMware acquisition.

Starbucks: A Brewing Crisis - App Economy Insights [Link] [LinkedIn]

Three main issues:

  1. The boycott impact: Starbucks lost 1.5M loyal customers after becoming embroiled, in October 2023, in a controversy related to the ongoing violence in the Middle East.
  2. A significant loss in traffic from non-Rewards members, driven by factors such as awareness of daily drink prices, the gourmet coffee boom, health-conscious consumers, changing work habits, and competitors offering coffee at lower prices. Starbucks’ responses include physical and digital enhancements such as an updated POS system, the Siren System speedup, and opening mobile ordering beyond its loyalty program.
  3. A price war in the Chinese coffee market (e.g., Luckin Coffee).

7 Mindsets That Are Slowing Down Your Career Growth - The Caring Techie Newsletter [Link]

  1. Solo Contributor Mindset -> prioritize getting things done with others
  2. That’s not my job -> be willing to do things outside of my scope
  3. My work will speak for itself -> do the work and say that I did the work
  4. If I do what I’m told, I will get promoted -> I need to sit in the driver’s seat of my career growth
  5. If I’m not getting any feedback, it means I’m doing good -> I need to actively seek feedback
  6. I’m not ready for the next level -> I might be ready for a promotion despite my doubts
  7. Picking the devil you know -> next promotion might come from joining another company

Mark Zuckerberg and Peter Thiel - Internal Tech Emails [Link]

Peter Thiel and Mark Zuckerberg on Facebook, Millennials, and predictions for 2030.

Google owes its stable position as much to Generative AI’s slow progress as its own innovations. While OpenAI, Anthropic, Meta, and others have built more powerful AI models into their chatbots, people haven’t substituted those bots for traditional search. As of February, Bing still had less than 4% of search market share worldwide compared to Google’s 91%. ChatGPT, for context, debuted nearly two years ago.

This week, when OpenAI introduced its own search engine, called SearchGPT, it didn’t exactly strike fear in the halls of Mountain View.

― Surprisingly, Google Is Thriving In The GenAI Era - Big Technology [Link]

Netflix: Ad Tech Focus - App Economy Insights [Link]

Tesla: Robotaxi Delay - App Economy Insights [Link]

Analysis:

  • Tesla’s revenue comes from three main sources 1) automative (78% revenue), 2) services and other (12% revenue), 3) energy generation and storage (10% revenue).
  • Production and Deliveries are the two main metrics.
  • Tesla’s margins have historically been ahead of other car manufacturers thanks to three critical leverages: 1) Economies of scale (though gigafactories), 2) Direct-to-consumer (online and via its showrooms), 3) Low marketing costs (Tesla barely spends on advertising).

Highlights:

  • Tesla missed earnings expectations for the fourth consecutive quarter.
  • Elon Musk pushed the Robotaxi announcement from August 8 to October 10.
  • Profits fell for the second straight quarter, driven by slower demand, competition, and price cuts. Price cuts remain a double-edged sword.
  • Operating margin declined by 3% YoY and was at its lowest in years. Negative impacts came from 1) price cuts, 2) delivery declines, 3) AI projects, 4) restructuring costs. Positive impacts came from lower cost per vehicle, the production ramp of 4680 cells, higher regulatory credits, and non-auto segments.
  • Energy generation and storage doubled.

Future:

  • Humanoid Robots (Optimus): Tesla will begin producing humanoid robots for internal use next year and plans to sell to other companies in 2026.
  • Market Share and BYD: Tesla outsold BYD in Q2 2024, but the gap between the two companies was only 18K deliveries. Tesla had a 50% share of US BEV sales, with 164K deliveries in Q2. As expected, Tesla’s share of BEV sales has consistently declined, reflecting the broadening adoption of all-electric cars.

More than 1.5 million developers are now using Gemini across our developer tools.

Waymo’s served more than 2 million trips to-date and driven more than 20 million fully autonomous miles on public roads. Waymo’s now delivering well over 50,000 weekly paid public rides, primarily in San Francisco and Phoenix.

Our AI-driven profit optimization tools have been expanded to Performance Max and standard Shopping campaigns. Advertisers using profit optimization and Smart Bidding see a 15% uplift in profit on average compared to revenue-only bidding.

Soon we’ll actually start testing Search and Shopping ads in AI overviews for users in the US, and they will have the opportunity to actually appear within the overview in a section clearly labeled as sponsored.

― Google: AI Spending Spree - App Economy Insights [Link]

Highlights of Q2 FY24: 1) Revenue growth slowed down, 2) search advertising showed no slowdown, 3) YouTube Ads growth was slower than in Q1, 4) subscriptions decelerated from Q1 because YouTube TV increased its price in Q2 FY23, 5) cloud accelerated, 6) margins improved YoY but are about to compress due to AI investments, 7) Capex was up and is expected to keep rising, 8) Alphabet committed \(\$5\) B to the ongoing operations of Waymo, 9) the company returned \(\$18.2\) B to shareholders, including \(\$15.7\) B in buybacks, showing confidence in its stock value.

Highlights of Cookies, Cloud, and YouTube: 1) Google planned to phase out third-party cookies from Chrome to address tracking-related privacy concerns, but reversed that decision and will instead let users choose their tracking preferences, 2) the AI boost continues to accelerate cloud revenue growth, especially in GCP and Workspace, 3) YouTube gained market share, 9.9% in June, up from 9.2% in the prior year.

Future: 1) Project Astra, 2) SearchGPT competition: Search is critical for Alphabet because it contributes 57% of revenue. SearchGPT could shake up the market, but its challenges are ensuring accuracy and avoiding hallucination. OpenAI has neither the user engagement nor the ad performance that a successful search business requires.

American Express had that network because of its legacy traveler’s check business so it was able to leverage that network to create and establish its credit card business. Without such a network, it’s impossible to operate a closed loop system.

― I Am Buying American: American Express - Capitalist Letters [Link]

Why is American Express superior to Visa and Mastercard? Its business model.

Visa is a typical payment processor. It connects the merchant to the issuer bank. It’s an open loop. American Express, on the other hand, is a closed loop system which makes it a money printing machine. It uses two strategies: 1) set stricter standards to issue cards, 2) provides travel privileges to attract frequent travelers who have higher net worth.

Why it is a good investment: 1) a giant moat due to the closed loop system, 2) inflation-proof: its customer base has stronger purchasing power, 3) it is expanding internationally and among younger people: in 2023, 60% of new consumer accounts were Gen-Z and Millennial, and international card services billed business grew 14% YoY last quarter, accounting for 35% of overall growth.

How Github grows and makes money - Productify by Bandan [Link]

Github’s culture values: 1) Customer-obsessed, 2) Ship to learn 3) Growth mindset, 4) Own the outcome, 5) Better together, 6) Diverse and inclusive.

How does Github make money: 1) Al powered tools - Github CoPilot, 2) Subscription Plans, 3) Enterprise solutions, 4) Marketplace and additional services - Github Marketplace, Github Actions, Github Packages.

Revenue Breakdown: Major contributors are Github CoPilot and Enterprise solutions, Steady contributors are subscription plans, growing segments are marketplace and additional services.

Github’s product and engineering culture: 1) Open source, 2) Remote first prioneers - pull requests, 3) Octocat obsession, 4) Continuous learning and growth, 5) Al integration - Github Copilot, 6) Hackathons and innovation time, 7) Inclusive design.

Key Takeaways from Github’s growth strategy: 1) Unwavering developer-centric focus and positioning, 2) Building relevant products for its user problems, 3) strong cultural values and community engagement

No Rules Rules - The secret sauce of Netflix - Tech Books [Link]

How to win at Enterprise AI - A playbook - Platforms, AI, and the Economics of BigTech [Link]

YouTube and Podcasts

Hot Swap growing, donors revolt, President Kamala? SCOTUS breakdown: Immunity, Chevron, Censorship - All-in Podcast [Link]

they’re probably two of the key things that I would look at to determine are we looking at a a true luxury business and then you can go into all um uh kinds of detail um but I think they would be the two ones I’d look at um in terms of the experience from the customer point of view I think it’s also important to remember that there needs to be a social (06:37) element um to the product or service for it to be a luxury in in the in a commercial sense and in a sort of financial um you know investor sense um and so the idea there is if you look at many Artisans or makers of high quality bespoke Goods um you know that they may very well be high quality product but there’s no social Dimension right so there’s no element of showing off if you will to use a slightly sort of negative connotation and so therefore I think in a in in this for the purposes of our discussion The Artisans and the and the (07:14) sort of small independent bespoke makers would not be considered luxury businesses right um so there would probably be two or three things I would look at clay to figure out if I’m looking at a true luxury business um and then finally the the fact that these businesses and the market overall tends to be driven more by the offering than the demand side so in some sense these companies create you know their own Market they create their own Demand by offering things to the consumer that the consumer may not realize they they need or desire or even um or or dream of right so there there’s a number of unique characteristics to these businesses which I think you know make them very interesting to study sometimes it’s difficult to determine if you’re looking at a true luxury business or not and sometimes Within These large groups take an lbmh for instance you know some of their offering for some of their brands take Cosmetics or perfumes some of that offering I probably wouldn’t classify as a luxury business right but there are still part (08:16) of um the group and and they generate some revenues at group level um and then you have some parts of the business say say the LV or Dior Brands where it’s and especially leather goods and apparel where it’s much easier to say that this is a this operates as a true luxury business so you know you can attempt to draw these distinctions but I think sometimes the lines are blurred。

The Luxury Strategy | Why LVMH & Hermès have Outperformed the Market w/ Christian Billinger (TIP643) [Link]

Simple Diffusion Language Models (15min video) [Link]

You cannot spend this kind of money and show no incremental revenue potential. So while this is incredible for NVIDIA, the chicken is coming home to roost, because if you do not start seeing revenue flow to the bottom line of these companies that are spending 26 billion dollars a quarter, the market cap of NVIDIA is not what the market cap of NVIDIA should be, and all of these other companies are going to get punished for spending this kind of money. Where are all these newfangled things that we are supposed to see that justify a hundred billion dollars of chip spend a year, two hundred billion dollars of energy spend, a hundred billion dollars of all this other stuff? We are now spending 750 billion dollars. This is on the order of a national transfer payment, and we’ve seen nothing to show for it except that you can mimic somebody’s voice. It doesn’t all hang together yet. - Chamath Palihapitiya

There’s gaps in the quality of the products that can be created to not have hallucination. Those gaps are too large right now for them to be used reliably in production settings unless you have a very defined scope. If you have a defined scope though, the implementation costs are not nearly what needs the level of spend to support. So there is just a big mismatch. Second is that we have a huge problem with NVIDIA, which is you can’t spend this kind of money to have tech lock-in to one hardware vendor, and that makes no sense. And what you are seeing now is that Amazon Google Microsoft AMD Intel, a plethora of startups Grok, everybody trying to make now different hardware solution. The problem though is that we have this massive lock-in right now, because the code is littered with all these NVIDIA specific ways of implementing access to GPUs, that’s not sustainable. So we are in an existential thrash and I think the only way that we are going to get around this is to do a little bit of a reset. And I think that’s going to touch a lot of startups that have already taken down way too much money at really insane valuations. I think we are in a bit of a reckoning right now it’s going to be complicated couple of quarters to at a minimum and probably a complicated year to sort out who’s actually real. - Chamath Palihapitiya

There is a ton of capital that was raised during the covid bubble era and the ZIRP (Zero Interest Rate Policy) era that needed a place to go. A lot of traditional business models, traditional in the technology sense - SaaS and a lot of biotech stuff - became uninvestable. Then there is a lot of money in the public markets that was sitting on the sidelines, sitting in treasuries and so on. So every dollar is looking for growth, and there are a lot of dollars still sitting around from the ZIRP era, coming into this post-ZIRP era looking for a place to grow. And there is very little growth, as we talked about, with the S&P 479 not being very performative with respect to growth and revenue or having a great outlook for the next five years. So when there is a glimmer of upside, a glimmer of opportunity, even if it’s just painting a picture of a growth story, all the capital drives into it. And we’ve all heard stories about these series A startups in AI getting bid up to a 100M valuation. I’ve seen a couple of these where people have pitched me things like protein-modeling AI startups, and it’s literally two guys from Meta and OpenAI that left and started the company, and they raised 30 on a 12 per year or something, and it’s just two guys building a model. That’s because that capital needs to find a place where it can tell itself a growth story. So I think we are still dealing with the capital hangover from ZIRP, and the fact that there is an area to invest in for real growth has allowed the AI bubble to grow as quickly as it has. - David Friedberg

Now, as Chamath points out, we are kind of rationalizing that back, and I do think that there is going to be a reset. I’ll also say, about the Goldman report, which I read, and some of the other analyses that have been done: there was some commentary along the lines of “hey, it costs me six times as much as having an analyst do this work, the energy cost of the AI is still so high, the actual performance of the model is not good.” That analysis is both right and wrong. It’s right in the sense that, yes, it’s more expensive today and the ROI is not there today. It’s wrong in that it ignores the model improvements that we’re seeing on nearly every metric over the past couple of months. Every few months, as we know, we see new models, new improvements, new architectures, new ways of leveraging the chips to drive a lower token cost, a lower energy cost per answer, a lower energy cost per run. Every metric that matters is improving, so if you fast forward another 24 or 36 months, I do think there is great reason to be optimistic that there is going to be extraordinary ROI based on the infrastructure that’s being built. The question is whether you are going to get payback before the next cycle of infrastructure needs to be built and everything turns over. We saw this during the dotcom boom, where a lot of people built out data centers, and by the time they were actually able to make money on those data centers, all the new telco equipment, all the new servers needed to be put in, and everything got written off. So there is a big capex question mark here, but I do think that the fundamental economics of AI will be proven out over the next couple of quarters. - David Friedberg

I’m much more bullish than you guys about this investment that’s being made. Remember that when the internet got started in the 90s, it was via dialup. I mean you literally had to have a modem and you would dial up the internet and it was incredibly slow. Photo sharing didn’t even work, so social networking wasn’t possible. And basically what happened next was that the Telecom company spent a ton of money building out broadband and people started upgrading to broadband. Then we had the Doom crash everyone thought that telecom companies had wasted billions of dollars investing in all this Broadband infrastructure. And it turned out that no they were right to do that, it just took a while for that to get used and this is a pretty common pattern with technological revolutions is that you can have a bubble in the short term but then it gets justified in the mid to long term. The build out of the railroads in the United States another example of this we had huge railroad bubbles but it turned out that that investment was all worthwhile. - David Sacks

― Biden chaos, Soft landing secured? AI sentiment turns bearish, French elections - All-in Podcast [Link]

Project 2025: The Radical Conservative Plan to Reshape America Under Trump | WSJ [Link]

Trump assassination attempt, Secret Service failure, Inside the RNC, VC liquidity problem - All-in Podcast [Link]

Trump’s VP pick JD Vance SPEAKS at 2024 RNC (FULL SPEECH) - NBC Chicago [Link]

You have to put one foot in front of the other every day, and you have to focus on tangible progress. Where that fails is when you get comparative - most people do it, I do it a lot, and I’ve tried to get better as I’ve gotten older - and I compare myself to the other person, the other company, the other funding round. There are so many reasons for you to feel like you’re less than something else. And the reality is that has nothing to do with you; you’re not in control of that. But it’s so hard. And then if I don’t take that medicine, I become insecure, and then I make mistakes that are entirely avoidable. So it’s just tangible progress on the things that I can control. That’s probably the most useful piece of advice that I try to remind myself of every day. - Chamath Palihapitiya

― The Besties Take Napa | All-In Special - All-in Podcast [Link]

Sharing good insights about AI, David’s amazing story with Poker, some great career advice. And happy birthday to David Friedberg!

We talked a little bit about it with Jonathan Haidt. There are some great studies showing that the change in income is a better predictor of happiness than absolute income. Eventually everything normalizes, so I think UBI makes no sense for three reasons. The first is this normalization of spending level. Once you’ve had this increase, you have a moment of happiness, and then you actually start spending differently or spending more. Effectively, every human has one innate trait: desire. Desire is what drives humanity. It’s what drives progress. It’s what pushes us forward, because no matter what our absolute condition, it’s our relative condition that matters - relative to others, or relative to ourselves in the past or prospectively in the future. And so we always want to improve our condition. A UBI-based system basically gives a flat income, so the only way for it to really work is if you increase the income automatically by, say, 10% a year. So in a UBI world, no amount of money will actually make someone satisfied or meet their minimum thresholds, because those minimum thresholds will simply shift. The second issue is the net economic effect: if we gave 350 million Americans 1000 bucks a month, that’s \(\$350\) billion a month, that’s \(\$4\) trillion a year. Our prospective budget for next year is 7.3 trillion at the federal level, so that’s already more than 50% of the total projected federal budget next year, and finding the mechanism for funding this at scale is not what this study actually looked at. Because if you look at it, the net effect would be inflationary. And that’s the third major reason: ultimately this would have an inflationary effect. Anytime we’ve stimulated the economy with outside, government-driven money, we see bubbles emerge and we see an inflationary effect. Look at covid: there were all these little bubbles that popped up in the financial markets - NFTs, crypto, all these new places that money found its way to - and then we had an aggregate inflationary effect; food prices are still up 30-40% since covid. So I think the study provides interesting insight into the micro effects, the psychological effects, the social effects, but the macro effects are simply arithmetically obvious: inflation and an inability to actually fund this at scale. And fundamentally, people want to work, so they’ll take that money and then go find ways to work and generate more money, and you get this inflationary effect. So I think UBI does not make sense. - David Friedberg

That’s not UBI right and what you’re describing I think exist and there are incentives and programs and opportunities out there people can sign up with Roth IRAs they can contribute some percentage of their paycheck to a 401k. If they have a job that has a 401k setup for them there’s a lot of systems and mechanisms out there and you get tax breaks for doing that. So there’s mechanisms and incentives out there to do that sort of thing the concept with UBI is can you pay people a flat amount of money so that they don’t have to work, and then they end up being able to explore and do other things with their life as the robots and AI does everything for them. And I’ve just always been of the belief that I don’t think that there’s this natural border that we hit beyond which humans don’t work. I think that AI based tools and automation tools are the same as they’ve always been. When we developed a tractor people didn’t stop farming. They could get much more leverage using the tractor and farm more. And new jobs and new Industries emerged. And I expect that the same thing will happen with this next evolution of technology and human progress. Humans will find ways to create new things to push themselves forward to drive things forward. And for the natural market-based incentives that fundamentally are rooted in this internal system of Desire will create new opportunities that we’re not really thinking about so I don’t believe in this idea of UBI in some utopian world where everyone’s happy not working and letting machines do everything for them I think that the fundamental sense of a human is to find purpose, and to realize that purpose to drive themselves forward and progress themselves. And I think that that’s always going to be the case. - David Friedberg

― Mag 7 sell-off, Wiz rejects Google, UBI, Kamala in, China’s nuclear buildout, Sacks responds to PG - All-in Podcast [Link]
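
A quick arithmetic check on Friedberg’s UBI math (my numbers, not his): 350M people \(\times \$1{,}000\) a month \(\times\) 12 months \(= \$4.2\)T a year, slightly above the \(\$4\)T he cites, and indeed more than half of the \(\$7.3\)T projected federal budget.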

Microsoft Volume II - Acquired Podcast [Link]

Blogs and Articles

A year later, what Threads could learn from other social networks - TechCrunch [Link]

Though Threads reached 175 million monthly active users in its first year and has made some progress, such as integrating with the fediverse, there is a lot it could improve by learning from other social networks.

  1. Custom Feeds: Learning from Bluesky: Threads should implement advanced custom feed features to allow users to easily follow specific topics and events without relying solely on tags.

  2. Third-Party Apps: Learning from Mastodon and Bluesky: Meta should consider opening up Threads to third-party developers to create diverse client applications, allowing for a broader range of user experiences and features.

  3. Algorithm Improvement: Improving “For You” Feed: Threads needs to refine its algorithm to ensure that users receive more relevant and personalized content, avoiding random or irrelevant posts that can detract from user experience.

  4. Handling News and Political Content: Learning from X and Mastodon: Threads should develop mechanisms to handle news and political content more effectively, balancing visibility without suppressing important information, and potentially integrating features like context-providing notes or bylines.

  5. Local Content Engagement: Learning from Instagram and Twitter: Threads should enhance its focus on local content by developing partnerships and features that cater to regional interests and events, like live scores for popular sports in specific regions.

  6. Separation from Instagram: Developing Independent Profiles: Threads should work on allowing users to create and manage profiles independent of Instagram accounts, offering more flexibility and autonomy in account management.

If “product-market-fit” means that you’ve found the right kind of product that the market wants… “Position-market-fit” means that you’ve found the right combination of product/brand/marketing/pricing/go-to-market/sales/etc in a given domain.

― Product-market fit is not enough anymore. You need position-market fit - Aakash Gupta on X [Link]

Product-market fit is about having the right product for the market, while position-market fit is about effectively positioning that product within the market to stand out and meet specific customer expectations.

A discussion of discussions on AI bias - Dan Luu [Link]

How to build a valuable tech company - Jason Shen on X [Link]

Jensen’s Mindmap about his secrets to building the mos tvaluable tech company in the world

jensen-mindmap

As a general rule, don’t let your company start doing the next thing until you’ve dominated the first thing. No great company I know of started doing multiple things at once—they start with a lot of conviction about one thing, and see it all the way through. You can do far fewer things than you think. A very, very common cause of startup death is doing too many of the wrong things. Prioritization is critical and hard.

While great founders don’t do many big projects, they do whatever they do very intensely. They get things done very quickly. They are decisive, which is hard when you’re running a startup—you will get a lot of conflicting advice, both because there are multiple ways to do things and because there’s a lot of bad advice out there. Great founders listen to all of the advice and then quickly make their own decisions.

Please note that this doesn’t mean doing everything intensely—that’s impossible. You have to pick the right things. As Paul Buchheit says, find ways to get 90% of the value with 10% of the effort. The market doesn’t care how hard you work—it only cares if you do the right things.

Fire quickly. Everyone knows this in principle and no one does it. But I feel I should say it anyway. Also, fire people who are toxic to the culture no matter how good they are at what they do. Culture is defined by who you hire, fire, and promote.

― Startup Playbook by Sam Altman - Sam Altman [Link]

A brief summary of Sam’s long article by George from prodmgmt.world on X.

startup-playbook

The primary battleground was data and AI governance.

Snowflake fired the first shot by open-sourcing Polaris, its catalog for Apache Iceberg, a popular open-source table format that’s compatible with any compute engine. Databricks countered by announcing its acquisition of Tabular, a managed solution for Iceberg created by the project’s founders, right in the middle of Snowflake’s conference. The following week, at their own summit, Databricks further upped the ante by open-sourcing its Unity Catalog in front of a live audience.

Data has gravity, so it’s far more efficient to bring applications and services to data rather than vice versa.

Both Databricks and Snowflake are now vying to build the ultimate enterprise AI platform: one capable of serving as the foundation for this “small-but-mighty” vision of AI. Their shared goal is to become the single source of truth for all of an organization’s data and use this position to power intelligent applications across every business function.

Databricks emerged from the open-source Apache Spark project and initially focused on serving the needs of data scientists and ML engineers. Its big data processing capabilities made it a natural fit for AI and data science workloads. Snowflake, by contrast, built its early success around a SQL-centric architecture and tight integration with BI tools, catering to data analysts and traditional IT departments with a closed, “it just works” solution.

― Databricks vs. Snowflake: What their rivalry reveals about AI’s future - Foundation Capital [Link]

Databricks and Snowflake are fighting for the future of enterprise AI. This article discusses four key concepts that shed light on the competitive dynamics: data gravity, the convergence of analytics and AI, the strategic importance of open source, and the rise of compound AI systems.

How to Interview and Hire ML/ AI Engineers - eugeneyan [Link]

Interviewing Meta CTO Andrew Bosworth on the Metaverse, VR/AR, AI, Billion-Dollar Expenditures, and Investment Timelines - MatthewBall.co [Link]

Spotify is no longer just a streaming app, it’s a social network - TechCrunch [Link]

Gen AI: too much spend, too little benefit? - Goldman Sachs [Link]

Crypto x AI report [Link]

AI’s shift to efficiency [Link]

What is AI? - Everyone thinks they know but no one can agree. And that’s a problem - MIT Technology Review [Link]

The Folly of Certainty - Howard Marks [Link]

On July 19, 2024 at 04:09 UTC, as part of ongoing operations, CrowdStrike released a sensor configuration update to Windows systems. Sensor configuration updates are an ongoing part of the protection mechanisms of the Falcon platform. This configuration update triggered a logic error resulting in a system crash and blue screen (BSOD) on impacted systems. The sensor configuration update that caused the system crash was remediated on Friday, July 19, 2024 05:27 UTC. This issue is not the result of or related to a cyberattack.

― Technical Details: Falcon Content Update for Windows Hosts [Link]

In any massive failure there are a host of smaller errors that compound; in this case, CrowdStrike created a faulty file, failed to test it properly, and deployed it to its entire customer base in one shot, instead of rolling it out in batches. Doing something different at each one of these steps would have prevented the widespread failures that are still roiling the world.

The real issue, though, is more fundamental: erroneous configuration files in userspace crash a program, but they don’t crash the computer; CrowdStrike, though, doesn’t run in userspace: it runs in kernel space, which means its bugs crash the entire computer — 8 million of them, according to Microsoft. Apple and Linux were not impacted, for a very obvious reason: both have long since locked out 3rd-party software from kernel space.

― Crashes and Competition - Ben Thompson on Stratechery [Link]
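
To make the batching point concrete, here is a generic sketch of a staged rollout with health gates. All function names and thresholds are hypothetical illustrations of the practice, not CrowdStrike’s actual pipeline.

```python
# Generic staged-rollout sketch: push an update to progressively larger
# batches, gate each stage on a health check, and halt on elevated failures.
# All names and thresholds here are hypothetical.
import random
import time

def push_update(host: str) -> None:
    """Placeholder for delivering the new content file to one host."""

def rollback(hosts: list) -> None:
    """Placeholder for reverting all hosts touched so far."""

def healthy(host: str) -> bool:
    """Placeholder health probe; real checks would use host telemetry."""
    return random.random() > 0.01  # simulate ~1% unrelated flakiness

def staged_rollout(hosts: list, fractions=(0.01, 0.10, 0.50, 1.0),
                   max_failure_rate=0.02) -> bool:
    deployed = 0
    for frac in fractions:
        target = int(len(hosts) * frac)
        batch = hosts[deployed:target]
        for host in batch:
            push_update(host)
        time.sleep(1)  # soak time; real pipelines wait hours between stages
        failures = sum(not healthy(h) for h in batch)
        if batch and failures / len(batch) > max_failure_rate:
            rollback(hosts[:target])  # blast radius capped at this stage
            return False
        deployed = target
    return True

if __name__ == "__main__":
    fleet = [f"host-{i}" for i in range(10_000)]
    print("rollout ok:", staged_rollout(fleet))
```

The key property is that a faulty update fails the health gate at the 1% stage and never reaches the full fleet, which is exactly the containment step missing in the Falcon incident.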

The Munger Series - Learning from Benjamin Franklin - Investment Master Class [Link]

How Benjamin Graham Survived World Panic on Wall Street (#17) - Beyond Ben Graham [Link]

Introducing Llama 3.1: Our most capable models to date - Meta AI Blog [Link]

Meta is releasing Llama 3.1 405B, the first frontier-level open-source AI model. Along with the Llama 3.1 70B and 8B models, they offer superior cost/performance and are open for fine-tuning and distilling. Meta is collaborating with companies such as Amazon, Databricks, NVIDIA, and Groq to support developers in fine-tuning and distilling the models.

Open Source AI Is the Path Forward - Meta News [Link]

In this letter, Zuckerberg emphasizes Meta’s commitment to open source AI. As with Unix and Linux, Zuckerberg believes AI development will eventually converge on open source. Open source AI has several benefits: 1) it benefits developers through customization, control and security, cost efficiency, and long-term standards, 2) it benefits Meta by avoiding lock-in to competitors’ ecosystems, allowing freedom in innovation and product development, enhancing its competitiveness, and building a community of partners and developers, 3) it benefits the world by providing widespread access to AI, ensuring safety and security, and avoiding a monopoly on AI power.

GPT-4o mini: advancing cost-efficient intelligence - OpenAI [Link]

Paper and Reports

Meta 3D Gen - Meta AI [Link]

AI Agents That Matter [Link]

This study highlights the importance of jointly optimizing cost and accuracy when benchmarking and evaluating AI agents. Given issues such as inadequate hold-out sets and the absence of standardized evaluation practices, the authors also propose a principled framework that emphasizes developing agents that are effective in practical scenarios, not just on benchmarks.
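
A toy sketch of the joint cost-accuracy comparison the authors advocate; the agent names and numbers below are made up for illustration.

```python
# Compare agents on the cost-accuracy plane and keep only Pareto-optimal
# ones, instead of ranking by accuracy alone. Data is illustrative.
agents = {
    "gpt-4o_react":      {"cost_usd": 0.042, "accuracy": 0.71},
    "gpt-4o-mini_react": {"cost_usd": 0.004, "accuracy": 0.62},
    "retry_baseline":    {"cost_usd": 0.050, "accuracy": 0.64},
}

def pareto_frontier(agents: dict) -> list:
    """An agent is dominated if another is at least as accurate and cheaper."""
    frontier = []
    for name, a in agents.items():
        dominated = any(
            b["accuracy"] >= a["accuracy"] and b["cost_usd"] < a["cost_usd"]
            for other, b in agents.items() if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier

print(pareto_frontier(agents))  # ['gpt-4o_react', 'gpt-4o-mini_react']
```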

Scaling Synthetic Data Creation with 1,000,000,000 Personas [Link]

This team generated 1B personas from web data and stored them in a Persona Hub. They introduced a synthetic data generation method called ‘persona-driven data synthesis’. These personas can be used to 1) generate personalized content, 2) support LLM prompting, 3) enhance product research, 4) create NPCs in games. The compression perspective is the more interesting way to understand the approach: Persona Hub can be seen as world knowledge compressed into distributed carriers, while public web text can be seen as the decompressed content created by these personas with their knowledge and experiences.
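
A minimal sketch of what persona-driven data synthesis can look like in practice, assuming an OpenAI-compatible chat API; the personas, prompt template, and model name are illustrative choices, not the paper’s exact pipeline.

```python
# Persona-driven data synthesis sketch: condition the same generation task
# on different personas to get diverse synthetic training examples.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

personas = [
    "a pediatric nurse who explains medicine to worried parents",
    "a high-school physics teacher who loves everyday analogies",
]

def synthesize_math_problem(persona: str) -> str:
    """Ask the model to write a math problem from one persona's viewpoint."""
    prompt = (
        f"You are {persona}. "
        "Create one challenging math word problem that naturally arises "
        "in your daily work, then give a step-by-step solution."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

synthetic_data = [synthesize_math_problem(p) for p in personas]
```

Varying the persona, rather than the task prompt, is what drives diversity at billion-sample scale.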

TextGrad: AutoGrad for Text [Link]

RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing [Link]

A Survey on Mixture of Experts [Link]

This is a comprehensive survey of the Mixture-of-Experts (MoE) technique for LLMs. MoE stands out for enabling model scaling with minimal additional computation. As a systematic literature review, the survey covers MoE’s structure, taxonomy, core designs, open-source resources, applications, and future research directions.
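
For intuition, here is a toy sparse MoE layer in PyTorch: a learned router picks the top-k experts per token, so parameter count scales with the number of experts while per-token compute stays roughly constant. This is an illustrative sketch, not a design taken from the survey.

```python
# Toy sparse Mixture-of-Experts layer with top-k routing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        # x: (n_tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e  # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(MoELayer()(tokens).shape)  # torch.Size([10, 64])
```

With top_k=2 of 8 experts, each token touches only a quarter of the expert parameters, which is the scaling trick the survey is about.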

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2 [Link]

Magic Insert: Style-Aware Drag-and-Drop - Google [Link]

PaliGemma: A versatile 3B VLM for transfer [Link]

FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision [Link]

FlashAttention-3 achieves a 1.5-2x speedup, reaching up to 740 TFLOPS on FP16 and nearly 1.2 PFLOPS on FP8. This increases GPU utilization to 75% of the theoretical maximum on H100 GPUs, up from 35%.

FlashAttention-3 introduces three main techniques to boost performance (a plain-PyTorch sketch of the underlying algorithm follows the list):

  1. Overlapping computation and data movement through asynchrony
  2. Interleaving block-wise matmul and softmax operations
  3. Using low-precision FP8 computation
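
As referenced above, here is a plain-PyTorch sketch of the blockwise online-softmax attention that FlashAttention kernels implement; it shows the algorithm only, since the speedups come from fusing this loop into hardware-aware CUDA kernels, which this sketch does not attempt.

```python
# Blockwise "online softmax" attention: process KV in tiles while keeping
# running max/denominator statistics, so the full (seq_q, seq_k) score
# matrix is never materialized.
import torch

def blockwise_attention(q, k, v, block=128):
    scale = q.shape[-1] ** -0.5
    m = torch.full((q.shape[0], 1), float("-inf"))   # running row max
    l = torch.zeros(q.shape[0], 1)                   # running softmax denom
    acc = torch.zeros_like(q)                        # running weighted sum
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = (q @ kb.T) * scale                       # scores for this tile
        m_new = torch.maximum(m, s.max(dim=-1, keepdim=True).values)
        p = torch.exp(s - m_new)                     # tile probabilities
        correction = torch.exp(m - m_new)            # rescale earlier tiles
        l = l * correction + p.sum(dim=-1, keepdim=True)
        acc = acc * correction + p @ vb
        m = m_new
    return acc / l

q, k, v = (torch.randn(256, 64) for _ in range(3))
ref = torch.softmax((q @ k.T) / 64 ** 0.5, dim=-1) @ v
assert torch.allclose(blockwise_attention(q, k, v), ref, atol=1e-4)
```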

Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs [Link]

Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures [Link]

Accuracy is Not All You Need [Link]

AI achieves silver-medal standard solving International Mathematical Olympiad problems - Google Research [Link]

This is one of the most surprising breakthroughs in this year of AI and LLMs: AlphaProof earned a silver-medal standard at the IMO. It’s a neurosymbolic system - a combination of Google’s Gemini LLM and DeepMind’s AlphaZero - using the LLM to generate plausible solution candidates and AlphaZero-style self-play search to find a correct one. It opens up a research direction that many AI frontier experts and companies have discussed: scientific discovery. Mastering math can be the first step toward expanding the frontier of our knowledge. OpenAI’s Strawberry project seems to have the same ambitions.

Gen AI: Too Much Spend, Too Little Benefit? - Goldman Sachs Research Newsletter [Link]

As money’s flooded into GenAI projects, people started to question whether or when the investment would net a return. Though the bubble may or may not be bursting, a healthy discussion like this is worth a read.

News

Prices fell in June for the first time since the start of the pandemic - CNN [Link]

CPI dropped 0.1% from May. The odds of the Fed cutting rates are increasing. The effects of high inflation are expected to be long-lasting.

Here’s how far the Dow has fallen behind the S&P 500 so far in 2024 - Morningstar [Link]

Tech Giants Face Tough Task to Sustain Second Half Stock Rally - Bloomberg [Link]

Magnificent 7 stocks have accounted for the majority of the S&P 500’s growth this year. If the AI optimism underlying this run fails to materialize, it could trigger a massive decline in the index.

Apple Poised to Get OpenAI Board Observer Role as Part of AI Pact - Bloomberg [Link]

Microsoft, Apple Drop OpenAI Board Plans as Scrutiny Grows - Bloomberg [Link]

This is Big Tech’s playbook for swallowing the AI industry - The Verge [Link]

Amazon Hires Top Executives From AI Startup Adept for AGI Team - Bloomberg [Link]

Big Tech companies are finding new ways to integrate AI startups into their operations without triggering antitrust scrutiny - the ‘reverse acquihire’, an approach where de facto acquisitions are structured as employment and licensing agreements. This is highlighted by Microsoft hiring Inflection’s team and licensing its AI tech, and by Amazon hiring roughly 2/3 of Adept’s personnel and securing a deal to license its AI tech.

What happened to the artificial-intelligence revolution? - The Economist [Link]

Silicon Valley companies are investing heavily in AI while the revenue from AI products is still far from the projected figures.

Humanoid robots powered by AI turn heads at the World Artificial Intelligence Conference - AP News [Link]

Record 300,000 visitors attend World AI Conference [Link]

The World AI Conference and High-level Meeting on Global AI Governance (WAIC) 2024 closed in Shanghai on Saturday, covering investment plans, cooperation projects, city-level organizations, and development plans for AI. Robotic tech such as humanoid models is capturing the attention of attendees.

Fame, Feud and Fortune: Inside Billionaire Alexandr Wang’s Relentless Rise in Silicon Valley - The Information [Link]

Robinhood snaps up Pluto to add AI tools to its investing app - TechCrunch [Link]

The AI tool Pluto will allow Robinhood to add tools for quicker identification of trends and investment opportunities, help guide users with their investment strategies, and offer real-time portfolio optimization.

“The algorithm is looking at traditional economic indicators that you would normally look at. But then inside of our proprietary algorithm, we’re ingesting the behavioral data and transaction data of 240 million Americans, which nobody else has,” said David Steinberg, co-founder, chairman and CEO of Zeta Global.

The eight verticals the economic index uses include automotive activity, dining and entertainment, financial services such as credit line expansion, health care, retail sales, technology and travel.

― A new index is using AI tools to measure U.S. economic growth in a broader way - CNBC [Link]

The Zeta Economic Index uses Gen AI to analyze “trillions of behavioral signals” to score growth of US economy.

OpenAI Hires Zapier Revenue Chief to Lead Sales Strategy - The Information [Link]

OpenAI has recently hired a new CFO and CPO to sharpen its focus on both consumer and enterprise products. It appointed Giancarlo Lionetti (former CRO at Zapier, who previously worked at Atlassian, Confluent, and Dropbox) to lead sales strategy for OpenAI’s sales team.

Tesla’s Share of U.S. Electric Car Market Falls Below 50% - The New York Times [Link]

Tesla’s Upcoming Model Y, Project Juniper, Spotted with Front Bumper Camera; Coming in 2025 [Link]

Persona’s founders are certain the world can use another humanoid robot - TechCrunch [Link]

Thermonuclear Blasts and New Species: Inside Elon Musk’s Plan to Colonize Mars - The New York Times [Link]

OpenAI says there are 5 ‘levels’ for AI to reach human intelligence - it’s already almost at level 2 [Link]

The reason we decided to do the 100k H100 and next major system internally was that our fundamental competitiveness depends on being faster than any other AI company. This is the only way to catch up. Oracle is a great company and there is another company that shows promise also involved in that OpenAI GB200 cluster, but, when our fate depends on being the fastest by far, we must have our own hands on the steering wheel, rather than be a backseat driver. - Elon Musk @ X

― xAI Appears to Confirm Ended Talks With Oracle Over Expanded AI Chips Agreement - WSJ [Link] [X]

Elon’s business strategy - being completely vertical integrated, on many of his companies (Tesla, SpaceX, etc) are working very well over the years.

Venture capital firm A16z stashing GPUs, including Nvidia’s, to win AI deals: report - Seeking Alpha [Link]

A16z has purchased thousands of GPUs, including Nvidia’s H100s, in an effort to win deals with AI startups. It stockpiles the GPUs and provides them to companies it invests in. Since it’s hard for startups to get vast amounts of computing power, this practice makes A16z more competitive in these VC deals.

OpenAI and Los Alamos National Laboratory announce bioscience research partnership - OpenAI [Link]

OpenAI and LANL are working together to evaluate how frontier models like GPT-4o can assist humans in physical lab settings through multimodal capabilities to support bioscience research.

In response to the fourth question in the investor call transcript, Furukawa said the following (obtained via machine translation and edited for clarity):

“In the game industry, AI-like technology has long been used to control enemy character movements, so I believe that game development and AI technology have always been closely related.

Generative AI, which has been a hot topic recently, can be more creative [in its use], but I also recognize that it has issues with intellectual property rights.

Our company has [had] the know-how to create optimal gaming experiences for our customers for decades.

While we are flexible in responding to technological developments, we would like to continue to deliver value that is unique to us and cannot be created simply by technology alone.”

― Nintendo becomes the biggest company in the games industry - and maybe the world - to say ‘no, thank you’ to using generative AI - PC Gamer [Link]

Most gaming companies would like to incorporate AI in some form, but Nintendo, as the biggest company in the game industry, said no thank you to Gen AI. This runs counter to what other game companies are aiming for, but it’s also reasonable: Nintendo has built incredible IP, wants to stay classic, and wants everything to be its own.

However, many people have imagined a future of video games powered by AI, with content dynamically created for players in real time.

Watch a robot navigate the Google DeepMind offices using Gemini - TechCrunch [Link]

Google DeepMind Robotics developed a robot navigation system powered by Gemini 1.5 Pro. It responds to human language commands and navigates the office environment. It uses “Multimodal Instruction Navigation with demonstration Tours (MINT)” to familiarize itself with the office and a hierarchical Vision-Language-Action (VLA) model for understanding and reasoning. Its ability to recall the environment is boosted by Gemini 1.5 Pro’s 1M-token context length.

OpenAI Scale Ranks Progress Toward ‘Human-Level’ Problem Solving - Bloomberg [Link]

OpenAI’s tiers range from AI that can interact with people in conversational language (level 1) to AI that can do the work of an organization (level 5). OpenAI executives believe they are at level one and approaching the second tier. The third tier on the way to AGI would be ‘Agents’ - AI systems that can spend several days taking actions on a user’s behalf. Tier 4 would be AI that can come up with innovations, and tier 5 would be ‘Organizations’.

Samsung’s Jam-Packed Galaxy Unpacked: Galaxy Ring, Z Fold 6 and All the New Products Announced [Link]

Among the 35 companies approved to test by the California DMV, seven are wholly or partly China-based. Five of them drove on California roads last year: WeRide, Apollo, AutoX, Pony.ai, and DiDi Research America. Some Chinese companies are approved to test in Arizona and Texas as well.

― Chinese self-driving cars have quietly traveled 1.8 million miles on U.S. roads, collecting detailed data with cameras and lasers - Fortune [Link]

Since 2017, self-driving cars owned by Chinese companies have traversed 1.8M miles in California alone, capturing video of their surroundings and mapping the state’s roads to within 2 cm of precision. This information has been transferred to data centers and used to train their self-driving systems.

Evaluate prompts in the developer console - Anthropic News [Link]

Anthropic releases some new features every week. Now they allow users to generate, test, and evaluate prompts in the Anthropic Console.

Fine-tune Claude 3 Haiku in Amazon Bedrock - Anthropic [Link]

Customers can now fine-tune Claude 3 Haiku in Amazon Bedrock to customize the model for vertical business use cases.

Shooting at Trump Rally Comes at Volatile Time in American History - The New York Times [Link]

This is crazy but legendary.

Insurers Pocketed $50 Billion From Medicare for Diseases No Doctor Treated - The Wall Street Journal [Link]

UnitedHealth Group committed a $50 billion fraud over the three years 2019, 2020, and 2021. Though treating doctors said no treatment or minimal treatment was necessary for a diagnosis, UnitedHealth overrode the doctors’ judgment, generated its own diagnosis codes, and billed Medicare under the new codes.

Thousands of Windows machines are experiencing a Blue Screen of Death (BSOD) issue at boot today, impacting banks, airlines, TV broadcasters, supermarkets, and many more businesses worldwide. A faulty update from cybersecurity provider CrowdStrike is knocking affected PCs and servers offline, forcing them into a recovery boot loop so machines can’t start properly. The issue is not being caused by Microsoft but by third-party CrowdStrike software that’s widely used by many businesses worldwide for managing the security of Windows PCs and servers.

― Major Windows BSOD issue hits banks, airlines, and TV broadcasters - The Verge [Link]

That from Christopher Thornberg who heads a California-based consulting firm called Beacon Economics. He says moving a main office like this out of state would likely mean anywhere from dozens of lost jobs to a couple hundred, not thousands of jobs lost.

Governor Newsom’s press office took to X after Musk made the announcement comparing California to Texas saying, “The last time Elon Musk moved an HQ, Tesla ended up expanding in California, even relocating their Global Engineering and AI headquarters to California because of diverse, world leading talent.”

― What Elon Musk’s Texas relocation plan for SpaceX, X HQs could mean for CA - ABC7 News [Link]

SearchGPT Prototype - OpenAI News [Link]

Substack

Salesforce: Worst Day in 20 Years - App Economy Insights [Link]

PayPal hired Mark Grether, who was head of Uber’s ad business, to lead its ad network initiative.

Costco’s membership fees declined from 86% to 50% of the its operating profit. It has shown economies of scale and benefits from a more favorable revenue mix.

Salesforce’s revenue grew 11% to \(\$9.1\)B, missing Wall Street estimates by \(\$20\)M. Current Remaining Performance Obligations - the best indicator of future growth - rose 10%, missing estimates of 12%. The slowing growth is partially due to broader macroeconomic challenges and internal execution issues. Salesforce Data Cloud is contributing to 25% of the deals valued above $1M, indicating it’s well-positioned to benefit from AI boom.

Live Nation caused widespread outrage during the 2022 Taylor Swift Eras Tour ticket sales, when fans faced technical glitches and exorbitant fees. It has been accused of locking venues into exclusive deals and bullying artists into using its services, driving ticket prices higher through service and convenience fees. Live Nation is under government scrutiny and may be forced to divest Ticketmaster (merged in 2010). Fans and artists are hoping for increased competition in the live music industry, lower prices, and a smoother ticket-buying experience.

Online Travel: AI is Coming - App Economy Insights [Link]

AI agents, as the next frontier, could make travel personalized. The key metrics that define success here are 1) gross bookings, 2) nights booked, 3) average daily rate (ADR), 4) revenue per available room (RevPAR), 5) customer acquisition cost (CAC).
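
For reference, these metrics are linked by a standard industry identity (standard hospitality definitions, not from the article): \(\text{RevPAR} = \text{ADR} \times \text{Occupancy rate} = \text{Room revenue} / \text{Rooms available}\).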

The largest travel companies (online travel agencies and rentals and hotel chains) are Booking Holdings, Airbnb, Expedia, Marriott, and Hilton.

Highlights: 1) Booking Holdings (operating as an OTA): CEO Glenn Fogel envisions AI enhancing connected trips (a single booking that includes multiple travel elements such as flights, accommodations, and car rentals). 2) Airbnb exceeded expectations on both revenue and profitability in Q1 thanks to robust international expansion, while growth in North America slowed; Airbnb is aiming to create an AI-powered concierge to elevate the overall Airbnb experience. 3) Expedia (operating as an OTA) is facing a period of transition and adjustment: integration of the Vrbo vacation rental platform into the Expedia platform is slower than expected, and it is also struggling to attract and retain customers in its B2C segment; a new CEO, Ariane Gorin, was recently appointed to navigate these challenges. 4) Marriott (operating as a booking platform) has developed a Homes & Villas tool that lets users search for vacation rentals using natural language; a RevPAR slowdown in North America suggests a shift in consumer preferences toward international destinations and alternative accommodations, while its brand reputation, loyalty program, and focus on group/business travel remain strong. 5) Hilton places a strong emphasis on personalization and loyalty programs despite headwinds in the US; CEO Chris Nassetta envisions AI-powered tools addressing guest concerns in real time.

From graphics rendering, gaming and media, cloud computing and crypto, Nvidia’s chips have led the way in each of these past innovation cycles as it saw its GPU applications expand over the last 2 decades. And now it is getting ready to advance the next industrial revolution, that will be powered by AI.

Some industry experts believe that 20% of the demand for AI chips next year will be due to model inference needs, with “Nvidia deriving about 40% of its data center revenue just from inference.”

― NVIDIA’s chips are the tastiest AI can find. It’s stock still has ways to go - The Pragmatic Optimist [Link]

This is a good summary of Nvidia’s computing strategies, its path to AI domination, its efficiency tailwinds, and its position in the future.

Nvidia is “at the right place at the right time”:

  • During 2000-2010, as the world emerged from the dot-com bust, demand for Nvidia’s GPUs increased with the proliferation of games and multimedia applications. By 2011, Nvidia had already begun repositioning its GPU strategy toward mobile computing. At the same time, the concepts of cloud computing, crypto-mining, and data centers started to form.
  • Nvidia built grounded relationships with academics and researchers; a paper published by Andrew Ng demonstrated the power of NVIDIA GPUs. During 2011-2015, Ng worked as Chief Scientist at big tech firms and deployed data center architectures based on Nvidia’s GPUs. During 2010-2014, data center and HPC revenue grew at a compounded rate of 64%. This period was one of the moments that set Nvidia on course to dominate AI.

In the semiconductor industry, there are two different ways of manufacturing chips at scale:

  1. Designing and manufacturing your own chips - what Intel was doing. Manufacturing chips is very expensive and hard; today, only chip foundries such as Taiwan’s TSMC and South Korea’s Samsung are able to maintain the leading edge.
  2. Designing powerful chips at a quicker pace and partnering with chip foundries like TSMC for production - what fabless companies like Nvidia and AMD are doing.

2024 could be the first year that Huang’s Nvidia cedes some market share to AMD. AMD launched its MI300 series and Intel launched its Gaudi 3 AI accelerator chip, both aiming to take share from Nvidia’s H100/H200 chips. However, Huang looks ahead:

  • Huang believes Nvidia must turn its attention to the next leg of AI - Model Inference.

    As tech companies spend more on AI data center equipment over the years, Nvidia’s revenue won’t slow down.

  • Nvidia’s executives also believe that company can benefit from demand from specific industry verticals, such as automotive.s

    Tesla, for example, is leading in self-driving cars.

  • Automotive AI and Sovereign AI are two future areas of growth where enterprises continue to spend on data centers for model training and inferencing.

The authors also assessed Nvidia’s valuation and believe that:

  • Between 2023 and 2026, Nvidia’s sales should grow at a compounded annual growth rate of 43-45% (a quick arithmetic check follows below).
  • Over the next 3 years, they expect operating profit to grow in line with revenue growth, with operating profit margins remaining relatively flat in 2025 and 2026.
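
As a quick back-of-the-envelope check (my arithmetic, not the authors’), compounding 43-45% over the three years from 2023 to 2026 gives \((1.43)^3 \approx 2.9\) to \((1.45)^3 \approx 3.0\), i.e., sales roughly tripling from the 2023 base.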

The Coming Wave of AI, and How Nvidia Dominates - Fabricated Knowledge [Link]

Nvidia is the clear leader in 1) System and Networking, 2) Hardware (GPUs and Accelerators), and 3) Software (Kernels and Libraries) but offers the whole solution as a product.

Amazon drives tremendous savings from custom silicon which are hard for competitors to replicate, especially in the standard CPU compute and storage applications. Custom silicon drives 3 core benefits for cloud providers.

  1. Engineering the silicon for your unique workloads for higher performance through architectural innovation.
  2. Strategic control and lock-in over certain workloads.
  3. Cost savings from removing margin stacking of fabless design firms.

The removal of these workloads from server CPU cores to the custom Nitro chip not only greatly improves cost, but also improves performance due to removing noisy neighbor problems associated with the hypervisor, such as shared caches, IO bandwidth, and power/heat budgets.

― Amazon’s Cloud Crisis: How AWS Will Lose The Future Of Computing - Semianalysis [Link]

A good overview of Amazon’s in-house semiconductor designs (Nitro, Graviton, SSDs, Inferentia, and Trainium). It includes how Microsoft Azure, Google Cloud, Nvidia Cloud, Oracle Cloud, IBM Cloud, Equinix Fabric, Coreweave, Cloudflare, and Lambda are each fighting Amazon’s dominance across multiple vectors and to various degrees.

Amazon’s custom silicon efforts - Nitro:

  • AWS worked with Cavium on developing the custom SoC on a discrete PCIe card and associated software, named “Nitro System”. It removes workloads from server CPU cores to the custom Nitro chips.
  • Annapurna Labs was acquired by Amazon in 2015. It focuses on server SoCs for networking and storage. Amazon continued its efforts on storage, and Nitro is the main enabler of a competitive advantage in storage and databases.
  • Nitro provides services such as virtual disk to the tenant’s virtual machines and enables customers to dynamically grow and shrink high performance storage at low cost.
  • Amazon worked with Marvell to co-design the AWS Nitro SSD controller. The focus was on avoiding latency spikes and latency variability, and maximizing the lifetime of the SSD.

The other two clouds (Google and Microsoft) are years behind Amazon; both required a partner, and both were stuck with 1st- or 2nd-generation merchant silicon for the next few years.

James Hamilton, an engineer at Amazon, looked at two key ways in which using AWS-designed, Arm-based CPUs could offer advantages compared to their external counterparts:

  1. Leveraging Arm’s scale in mobile by using the Arm-designed Neoverse cores, or
  2. Leveraging TSMC’s manufacturing scale. Both would reduce costs and offer better value to customers.

In-house CPUs enable Amazon to design for maximum density and minimum server- and system-level energy, which helps reduce costs. Amazon’s tremendous scale, especially in general-purpose compute and storage-related verticals, will continue to drive a durable advantage in the cloud for many years.

Ultrafusion is Apple’s marketing name for using a local silicon interconnect (bridge die) to connect the two M2 Max chips in a package. The two chips are exposed as a single chip to many layers of software. The M2 Ultra utilizes TSMC’s InFO-LSI packaging technology. This is a similar concept to TSMC’s CoWoS-L, which is being adopted by Nvidia’s Blackwell and future accelerators down the road to make large chips. The only major differences between Apple’s and Nvidia’s approaches are that InFO is a chip-first process flow while CoWoS-L is chip-last, and that they use different types of memory.

― Apple’s AI Strategy: Apple Datacenters, On-device, Cloud, And More - Semianalysis [Link]

Apple’s purchases of GPUs are minuscule, and Apple is not a top-10 customer of Nvidia. The production of M2 Ultras is consistent with Apple using its own silicon in its own data centers to serve AI to Apple users. Apple also has expansion plans for its own data center infrastructure. Furthermore, Apple made a number of hires, including Sumit Gupta, who joined to lead cloud infrastructure at Apple in March.

One of the best-known non-bank banks is Starbucks – “a bank dressed up as a coffee shop”. Trung Phan rates the misperception up there alongside “McDonald’s is a real estate company dressed up as a hamburger chain” and “Harvard is a hedge fund dressed up as an institution of higher learning”.

Today, more than 60% of the company’s peak morning business in the US comes from Starbucks Rewards members who overwhelmingly order via the app. The program has 33 million users, equivalent to around one in ten American adults.

― Banks in Disguise - Net Interest [Link]

Starbucks had offered a gift card since 2001 and started to pair it with a new loyalty program, “Starbucks Rewards”, in 2008. Consumers get free wifi and refillable coffee by paying with a reloadable card. The card was put onto an app in 2010 and expanded to over 9,000 locations. It quickly became the largest combined mobile payments and loyalty program in the US. Users load or reload around \(\$10\)B of value onto their cards each year, and about \(\$1.9\)B of stored card value sits on the company’s balance sheet, just like customer deposits. There are several advantages: 1) the company does not pay interest on customer funds, and 2) it sweeps customer funds into the company’s own account when it concludes customers may have forgotten about them - this is called ‘breakage’. In late 2023, Starbucks was accused of making it impossible for consumers to spend down their stored value cards by only allowing funds to be added in \(\$5\) increments and requiring a \(\$10\) minimum spend. Although the company has to pay rewards to customers, it saves on merchant discount fees and receives a lot of free and valuable personal information about customers.

Delta’s SkyMiles scheme is one of the largest globally, with 25M active members. There are two ways points schemes generate money: 1) when scheme members buy a regular ticket, they also buy mileage credit they can redeem in the future, and 2) the schemes make money from card companies (such as American Express) and other partners.

VC Says “Chaos” Coming for Startups, Ads, and Online Business as Generative AI Eats Web - Big Technology [Link]

The main point is that, as generative AI is ingested into the web, a decades-old system of online referrals and business building will be reshaped. The business model of every existing business on the internet (online travel, ecommerce, online advertising, etc.) is impacted by the transformation of online search by AI. It decreases the number of customer impressions on the internet, thus reducing the impressions advertisers are charged for, and it also reduces the chance for startups to be discovered and to build brands.

OpenAI: $80 Billion - Trendline [Link]

The top 10 most valuable unicorns are: 1) ByteDance, 2) SpaceX, 3) OpenAI, 4) SHEIN, 5) Stripe, 6) Databricks, 7) Revolut, 8) Fanatics, 9) Canva, 10) Epic Games.

Nvidia’s four largest customers each have architectures in progress or production in different stages:

  1. Google Tensor
  2. Amazon Inferentia and Trainium
  3. Microsoft Maia
  4. Meta MTIA

― What’s all the noise in the AI basement? - AI Supremacy [Link]

The current situation of players in the semiconductor industry in the context of the AI competition.

Big 4 Visualized - App Economy Insights [Link]

The four titans of the accounting industry - Deloitte, PwC, EY, and KPMG. They make money from 1) audit: verifying financial statements, 2) assurance: including processes, internal controls, cybersecurity assessments, and fraud investigations, 3) consulting: offering advice on everything from M&A to digital transformation, especially in helping enterprise software sales, and 4) risk and tax advisory: navigating compliance, regulations, and tax laws.

Insights:

Deloitte: 1) fastest growing in revenue, 2) heavily investing in AI and digital transformation, 3) acquisition as a growth strategy.

PwC: 1) heavily investing $1B in Gen AI initiative with Microsoft, 2) will become the largest customer and 1st reseller of OpenAI’s enterprise product, 3) leader of financial services sector, serving most global banks and insurers, 4) has faced scrutiny over its audit of the failed cryptocurrency exchange FTX, raising concerns about its risk management practices.

EY: 1) invested $1.4B to create EY.ai and EYQ, an AI platform and LLM, 2) abandoned its “Project Everest” plan to split its audit and consulting businesses in 2023, 3) is growing through strategic acquisitions, 4) has faced criticism over a roughly $2B hole in a client’s accounts, raising concerns about its audit practices and risk management, 5) was fined $100M because hundreds of employees cheated on ethics exams.

KPMG: 1) focusing on digital transformation (data analytics, AI, and cybersecurity), 2) has faced regulatory scrutiny and fines due to audit practices, raising concerns about its audit quality and independence.

I think the release highlights something important happening in AI right now: experimentation with four kinds of models - AI models, models of use, business models, and mental models of the future. What is worth paying attention to is how all the AI giants are trying many different approaches to see what works.

This demonstrates a pattern: the most advanced generalist AI models often outperform specialized models, even in the specific domains those specialized models were designed for.

That means that if you want a model that can do a lot - reason over massive amounts of text, help you generate ideas, write in a non-robotic way - you want to use one of the three frontier models: GPT-4o, Gemini 1.5, or Claude 3 Opus.

The potential gains to AI, the productivity boosts and innovation, along with the weird risks, come from the larger, less constrained models. And the benefits come from figuring out how to apply AI to your own use cases, even though that takes work. Frontier models thus have a very different approach to use cases than more constrained models. Take a look at this demo, from OpenAI, where GPT-4o (rather flirtatiously?) helps someone work through an interview, and compare it to this demo of Apple’s AI-powered Siri, helping with appointments. Radically different philosophies at work.

― What Apple’s AI Tells Us: Experimental Models⁴ - One Useful Thing [Link]

  1. AI Models: Apple does not have a frontier model like Google and Microsoft do, but it has created a bunch of small models that are able to run on AI-focused chips in its products, plus a medium-sized model in the cloud that the iPhone can call. The model running on the iPhone and the model running in the cloud are as good as Mistral’s models and ChatGPT, respectively.

  2. Models of Use: However, the potential gains of AI - the productivity boosts and innovation - come from larger, less constrained models. You would prefer to use GPT-4o for nuanced tasks, such as helping with your interviews, rather than Apple’s AI-powered Siri.

  3. Business Models: Apple sounds like it will start with a free service as well, but may decide to charge in the future. The truth is that everyone is exploring this space, and how they make money and cover costs is still unclear (though there is a lot of money out there). People don’t trust AI companies because they are concerned about privacy. However, Apple makes sure its models cannot learn about your data even if they wanted to: personal data on your iPhone is only accessed by local AI.

    Data handed to the cloud is encrypted, and data given to OpenAI is anonymous and requires explicit permission. Apple is making very ethical use of AI, though we should still be cautious about Apple’s training data.

  4. Models of the Future: Apple and OpenAI have different goals. Apple is building narrow AI systems that can accurately answer questions about your personal data, while OpenAI is building autonomous agents that would complete complex tasks for you. In comparison, Apple has a clear and practical vision of how AI can be applied, while the future of OpenAI’s AGI remains to be seen.

Elon has been spreading significant FUD by threatening to prohibit Apple devices at his companies. The truth is that Apple will at no point send any of your data to OpenAI without explicit user permission. Even if you opt to “Use ChatGPT” for longer questions, OpenAI isn’t allowed to store your data.

According to Counterpoint Research, smartphone makers who have launched AI features on their smartphones have seen a revival in sales. Look at Samsung for example, where its S24 series grew 8% compared to S23 in 2024 with sales for its mid-range premium model growing 52% YoY. With Apple having a larger market share, along with receding expectations for an economic recession, this could be the start of a new growth chapter for the Cupertino darling once again.

Pete Huang at The Neuron explains, step by step, what really goes down when you ask the AI-powered Siri a question:

  1. For almost all questions, Siri uses AI that lives on the device, aka it won’t need to hit up the cloud or ChatGPT, aka your question won’t ever leave the phone.

    • These on-device models are decent (they’re built on top of open-source models) and outperform Google’s on-device model, Gemma-7B, 70% of the time.
  2. For more complex questions like “Find the photo I took at the beach last summer,” Siri will consult a smarter AI model that runs on Apple’s servers.

    • When Siri sends your question to Apple’s servers, your data is anonymized and not stored there forever.
  3. Now, for longer questions like “Can you help me create a weekly meal plan?” or “Rewrite this email using a more casual tone,” Siri will use ChatGPT only if you give it permission to.

    • Even if you opt for “Use ChatGPT,” OpenAI isn’t allowed to store your data.

― The Real Test For Consumer’s AI Appetite Is About To Begin - The Pragmatic Optimist [Link]

It’s interesting to see how it actually works when you ask the AI-powered Siri a question.
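The three-step flow above is essentially a routing policy over model tiers. Here is a minimal sketch of that idea in Python; the tier names, the `route` function, and the keyword heuristics are all hypothetical illustrations, not Apple’s actual (unpublished) classifier:

```python
from dataclasses import dataclass

# Hypothetical tiers mirroring the three steps described above.
ON_DEVICE = "on-device model"
PRIVATE_CLOUD = "Apple server model"
CHATGPT = "ChatGPT (opt-in)"

@dataclass
class SiriRequest:
    text: str
    user_approved_chatgpt: bool = False  # explicit per-request permission

def needs_heavy_reasoning(text: str) -> bool:
    # Toy heuristic standing in for a real query classifier.
    return any(k in text.lower() for k in ("find the photo", "last summer"))

def needs_world_knowledge(text: str) -> bool:
    return any(k in text.lower() for k in ("meal plan", "rewrite this email"))

def route(request: SiriRequest) -> str:
    """Pick a processing tier for a Siri query (illustrative only)."""
    if needs_world_knowledge(request.text):
        # Step 3: ChatGPT only with permission; OpenAI may not store the data.
        return CHATGPT if request.user_approved_chatgpt else PRIVATE_CLOUD
    if needs_heavy_reasoning(request.text):
        # Step 2: anonymized, not stored forever on Apple's servers.
        return PRIVATE_CLOUD
    # Step 1: almost all questions never leave the phone.
    return ON_DEVICE

print(route(SiriRequest("Can you help me create a weekly meal plan?")))
# -> "Apple server model" (ChatGPT was not approved for this request)
```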

5 Founder-Led Businesses - Invest in Quality [Link]

Three research findings:

  1. Founder-led businesses outpaced other companies by a wide margin. (Researched by Ruediger Fahlenbrach in 2009).
  2. Family-owned businesses ignored short-term quarterly numbers to focus on long-term value creation, which led to major outperformance because of 1) lower risk-taking in the short term, and 2) greater vision and investment for the long term. (Researched by Henry McVey in 2005.)
  3. Businesses managed by billionaires outperformed the market by 7% annually from 1996 to 2011. (Researched by Joel Shulman in 2012.)

Insights behind the findings above:

  1. Founders and owners often have their life savings invested in the shares of the business, so their incentives are aligned.
  2. Bureaucracy reduces business performance. Bureaucratic managers will almost never make a radical shift, because corporate politicians care more about their job titles than the long-term prospects of the business. Founders, on the other hand, are able to make radical decisions and overrule the bureaucracy, so they can take the business in a direction that fulfills a long-term vision.
  3. Founders and billionaires are exceptional people at running businesses.

The article listed five examples of founder-led businesses: MercadoLibre, Adyen, Fortinet, Intercontinental Exchange, and Paycom.

This move positions Apple as an AI aggregator, offering users a curated selection of the best AI tools while keeping their data private. It’s a win-win. Apple gets to enhance its user experience with powerful AI capabilities. At the same time, OpenAI gains access to Apple’s massive user base for brand recognition and potential upsell to ChatGPT Plus. There is no detail available on the exact terms of the partnership.

― Apple: AI for the Rest of Us - App Economy Insights [Link]

Integrating ChatGPT alongside Apple Intelligence features is a smart move that allows Apple to focus on their strengths (privacy, personalization) while leveraging general knowledge AI from OpenAI. This will enable Apple to blame any wrong answers and hallucinations on the LLMs the company partners with and stay out of PR trouble.

The Other Side of the Trade - The Rational Walk [Link]

An ethical reflection on taking advantage of a glitch caused by a software malfunction.

I have noticed the proliferation of a different type of species in academia: what I call The Failed Corporatist. This is someone who stumbles upon academia not so much out of a love for The Truth, as due to an inability to thrive in corporate settings for various other, unrelated reasons. But the Failed Corporatist has a very conventional, corporate like mindset anyway. It usually loves process, admin and adding more admin and adding more process and METRICS and social conformity. This skill set enables them to ascend the ranks of academic administration, often gaining significant influence and control over The Weird Nerd. Confronted with this altered habitat, The Nerd often finds itself in a state of distress and confusion. Its intrinsic motivation clashes with the newly imposed corporate-like order and the demand for conformity, leading to frantic efforts to assert its natural tendencies. Unfortunately, these efforts are often met with resistance or outright rejection, not only from The Failed Corporatist but also from the broader world that the academic reserve is a part of. All in all, I think this disturbance means the remaining Nerds are further driven away.

― The flight of the Weird Nerd from academia - Ruxandra’s Substack [Link]

A couple of months ago I wrote a piece called “The flight of the Weird Nerd from academia”, in which I argued there is a trend wherein Weird Nerds are being driven out of academia by the so-called Failed Corporatist phenotype. Katalin Karikó is a perfect example of a Weird Nerd. I recently argued that many Weird Nerds (I called them autistics, but people really hated that), have found a refuge on the Internet, where their strengths are amplified and their weaknesses are less important.

And I believe the conversation here starts with accepting a simple truth, which is that Weird Nerds will have certain traits that might be less than ideal, that these traits come “in a package” with other, very good traits, and if one makes filtering or promotion based on the absence of those traits a priority, they will miss out on the positives. It means really internalizing the existence of trade-offs in human personality, in an era where accepting trade-offs is deeply unfashionable, and structuring institutions and their cultures while keeping them in mind.

Everything comes at a cost: spend more time worrying about politics, there will be less time for science. What’s more, the kind of people who really care about science or truth to the extent that Karikó did, are not the same people that get motivated by playing politics or being incredibly pleasant. There is a strong anti-correlation between these interests (that of course does not mean there is no one who is good at both.) Selecting future intellectuals based on traits like Agreeableness or Extraversion might not be only unnecessary, but actually harmful. We might be actively depleting the talent pool of the kind of people we do want to see in academic institutions.

― The Weird Nerd comes with trade-offs - Ruxandra’s Substack [Link]

The intersection of AI with the ocean of mass surveillance data that’s been building up over the past two decades is going to put truly terrible powers in the hands of an unaccountable few. - Edward Snowden

The former head of the NSA may be a great guy. But you don’t put the former head of the NSA on your board (as OpenAI just did) because he’s nice. You put him there to signal that you’re open to doing business with the IC and DoD. - Matthew Green

― Is OpenAI an AI Surveillance Tool? [Link]

OpenAI hired retired US Army general Paul M. Nakasone to its board of directors. This raises an issue of trust and makes people question where it leads.

Adobe: Expanding Universe - App Economy Insights [Link]

Adobe is one of the companies potentially most disrupted by Gen AI, but it has turned out to be one of the fastest to capitalize on the technology. Adobe benefits the most from AI as it incorporates it into all layers of its existing stack.

Revenue has three main segments: 1) digital media, 74% (Creative Cloud including Adobe Express, Document Cloud including Adobe Acrobat), 2) digital experience, 25%, and 3) publishing and advertising, 1%.

New AI-powered products: 1) GenStudio is a new Gen AI-powered platform aiming to streamline the entire content creation process; it was announced at Adobe Summit 2024 in March, is expected to launch in Q3 2024, and will be integrated into Adobe Experience Cloud plans or offered as a standalone product. 2) Adobe Experience Platform AI Assistant is a natural language chatbot. 3) Adobe Experience Manager is a tool to deliver the right content to users at the right time. 4) Adobe Content Analytics gives access to tools to measure the marketing performance of AI-created content. 5) Acrobat AI Assistant was integrated into Adobe Acrobat Reader. Others: 1) Firefly Services, 2) Adobe Express on mobile, etc.

One thing is clear: the future of fast food will be shaped by brands that can adapt to the changing landscape and evolve through savvy marketing, new menus, and the boost of tech to prepare and deliver your favorite meal.

― Fast Food Economics - App Economy Insights [Link]

The quick-service restaurant (QSR) industry is undergoing a shift. This article helps explain how the QSR giants navigate a landscape of soaring inflation, labor shortages, and ever-changing consumer tastes.

Giants:

  1. McDonald’s: primarily a real estate company, with the majority of revenue coming from franchised restaurants paying rent and royalties. It is working on more compelling value deals, menu innovation, digital sales, and the MyMcDonald’s rewards program, with a growth plan of reaching 50,000 restaurants globally by 2027 and doubling sales from its loyalty program.
  2. Chipotle: its digital platform works well, its menu is strong enough to justify rising prices, its rewards program members show impressive loyalty and boost sales, and it is testing its new automated digital makeline and the food prep robot ‘Autocado’.
  3. YUM! (KFC, Taco Bell, Pizza Hut, Habit Burger Grill): Taco Bell has proved resilient and popular, while KFC and Pizza Hut are experiencing sales declines. Digital delivery and sales are bright spots, showing that their investments in online ordering, delivery, and AI-powered drive-thru tech are paying off.
  4. Restaurant Brands International (RBI) (Tim Hortons, Burger King, Popeyes, Firehouse Subs): Burger King’s investments in store renovations, menu innovations, and marketing campaigns since 2022 are paying off. Tim Hortons, the coffee and donut chain, performs reliably, especially in its home market of Canada. Popeyes continues strong store sales growth. The main component of RBI’s growth strategy is digital transformation.

Broadcom: AI Surge - App Economy Insights [Link]

Broadcom operates across two primary segments: 1) semiconductor solutions, which has traditionally been Broadcom’s core strength, and 2) infrastructure software, which was propelled by the acquisition of VMware in November 2023.

Highlights: 1) $3.1B (roughly 26% of total) in revenue is from AI products, and AI alone contributed a $2.2B revenue increase year over year. 2) Margins were down significantly year over year, primarily due to expenses related to the VMware integration. 3) Broadcom has a gigantic net debt position of $62B. 4) Strong cash generation - $18B in the past 12 months. 5) A 10-for-1 stock split will happen on July 15. 6) It’s known for a regular cadence of product introductions - the Tomahawk and Jericho switching products.

AI is both a threat and an opportunity for software leaders. But for cybersecurity giants, AI means business. New technology means new threats, with Large Language Models (LLMs) dealing with vast amounts of data in the cloud and on devices.

― Cybersecurity Earnings - App Economy Insights [Link]

The AI tech stack has three layers: 1) top: apps and enterprise software, 2) middle: LLMs, 3) bottom: compute hardware and chips. The bottom layer (NVIDIA, AMD, ASML, TSMC) and the middle layer (AWS, Azure, Google Cloud) have already benefitted from an AI revenue boost. For the top layer, however, it usually takes longer to manifest, because companies take time to adapt to new ways of optimizing processes, particularly in the ERP, CRM, and BI verticals.

There is sentiment around enterprise software saying that AI will drive the cost of software to zero. But there are counterarguments: 1) the main expense for most software companies is not R&D but sales and marketing, 2) switching costs are high enough that even freemium software is not able to disrupt existing solutions, 3) the resources needed to develop new features will decline, and 4) implementing and maintaining a software solution is costly.

This article shows how some cybersecurity companies are navigating the current environment.

Highlights:

  1. Palo Alto Networks

    Strong momentum and rapid growth in Next-Gen Security (NGS) offerings. Its platformization focus makes it a one-stop shop for security needs, including cloud-delivered security services like Prisma Access (SASE), Prisma Cloud (cloud security), and Cortex (security operations). It’s facing billing issues and is slashing its FY24 billings guidance by $600M.

  2. CrowdStrike

    Strong Q1 performance, lower Total Cost of Ownership (TCO) due to lightweight agent and unified approach, $4B in ARR growing at over 30% YoY.

  3. Fortinet

    It specializes in network security appliances, secure SD-WAN, and operational tech security. It’s hardware-centric. And it’s currently under competitive pressure.

  4. Zscaler

    It specializes in Zero Trust solutions, a security model that assumes no user or device should be trusted by default. It has strong growth and an optimistic outlook, riding the wave of increasing cybersecurity demand. There is currently a rumor of a Broadcom acquisition.

  5. Cloudflare

    Despite beating earnings estimates in Q1 FY24, the stock price has dropped due to concerns about its conservative guidance. Although there are short-term headwinds, long-term growth is still promising.

AI may take longer to monetize than most expect. How long will investor optimism last? - The Pragmatic Optimist [Link]

The Dark Stain on Tesla’s Directors - Lawrence Fossi [Link]

2,596 - How to make the most out of Google’s leaked ranking factors [Link]

Ramp and the AI Opportunity [Link]

How Perplexity builds product [Link]

IBM’s Evolution of Qiskit [Link]

Articles and Blogs

Today, foundries manufacture the supermajority of the chips produced in the world, and Taiwan Semiconductor Manufacturing Company (TSMC) alone has ~60% market share in the global foundry market.

Perhaps more astonishingly, TSMC has a de-facto monopoly with ~90% market share in the leading edge nodes (manufacturing processes with the smallest transistor sizes and highest densities). Leading edge nodes are crucial for applications requiring the highest computing performance like supercomputers, advanced servers, high-end PCs/laptops, smartphones, AI/machine learning, and military/defense systems. As a result, the very basic tenet of modern life is essentially standing on the shoulders of one company based in Taiwan.

― TSMC: The Most Mission-Critical Company on Earth [Link]

This is a deep dive report of Taiwan Semiconductor Manufacturing Company (TSMC).

In terms of next steps, Google has “limited the inclusion of satire and humor content” as part of “better detection mechanisms for nonsensical queries.” Additionally:

  • “We updated our systems to limit the use of user-generated content in responses that could offer misleading advice.”
  • “We added triggering restrictions for queries where AI Overviews were not proving to be as helpful.”
  • “For topics like news and health, we already have strong guardrails in place. For example, we aim to not show AI Overviews for hard news topics, where freshness and factuality are important. In the case of health, we launched additional triggering refinements to enhance our quality protections.”

― Google explains AI Overviews’ viral mistakes and updates, defends accuracy [Link]

The AI Revolution Is Already Losing Steam [Link]

Questions remain as to whether AI could become commoditized, whether it has the potential to produce revenue and profits, and whether a new economy is actually being born.

According to Anshu Sharma, the future of AI startups (OpenAI and Anthropic) could be dim, while big tech companies (Microsoft and Google) will make profits from existing users and networks but will need to spend a lot of money for a long time, which would leave the AI startups unable to compete. It is true that AI startups are already struggling, because at the current stage AI is hard to commoditize and it requires a lot of investment.

The improvement of AI is slowing down because developers have exhausted the readily available data on the internet. As for the adoption of AI in the enterprise, given the technical challenges, it will take time before chatbots can replace the specialized knowledge of human experts.

One thing we’ve learned: the business goal must be paramount. In our work with clients, we ask them to identify their most promising business opportunities and strategies and then work backward to potential gen AI applications. Leaders must avoid the trap of pursuing tech for tech’s sake. The greatest rewards also will go to those who are not afraid to think big. As we’ve observed, the leading companies are the ones that are focusing on reimagining entire workflows with gen AI and analytical AI rather than simply seeking to embed these tools into their current ways of working. For that to be effective, leaders must be ready to manage change at every step along the way. And they should expect that change to be constant: enterprises will need to design a gen AI stack that is robust, cost-efficient, and scalable for years to come. They’ll also need to draw on leaders from throughout the organization. Realizing profit-and-loss impact from gen AI requires close partnership with HR, finance, legal, and risk to constantly readjust the resourcing strategies and productivity expectations. - Alex Singla

Although it varies by industry, roughly half of our survey respondents say they are using readily available, off-the-shelf gen AI models rather than custom-designed solutions. This is a very natural tendency in the early days of a new technology—but it’s not a sound approach as gen AI becomes more widely adopted. If you have it, your competitor probably has it as well. Organizations need to ask themselves: What is our moat? The answer, in many cases, likely will be customization. - Alexander Sukharevsky

― The state of AI in early 2024: Gen AI adoption spikes and starts to generate value - McKinsey [Link]

Industries are struggling with budgeting for Gen AI. There are some areas where investments are paying off, such as meaningful cost reductions in HR and revenue increases in supply chain management from Gen AI.

There are various risks of Gen AI usage: data privacy, bias, intellectual property (IP) infringement, model management risks, security and incorrect use, etc. Among them all, inaccuracy and intellectual property infringement are increasingly considered relevant risks to organizations’ Gen AI use.

According to the three archetypes for implementing Gen AI solutions (takers, shapers, and makers), the survey found that in most industries organizations are finding off-the-shelf offerings applicable to their business needs: about half of reported Gen AI use relies on publicly available models or tools with little or no customization. Respondents in energy and materials, technology, and media and telecommunications are more likely to report significant customization or tuning of publicly available models, or developing their own proprietary models to address specific business needs.

For most respondents, the time required to put Gen AI into production is around one to four months.

Gen AI high performers are excelling. Some common characteristics and practices: 1) they are using Gen AI in more business functions (an average of three, compared to an average of two for others), 2) like others, they are most likely to use Gen AI in marketing and sales and in product or service development, but they are more likely than others to use Gen AI solutions in risk, legal, and compliance; in strategy and corporate finance; and in supply chain and inventory management, 3) they are three times as likely as others to be using Gen AI in activities ranging from processing of accounting documents and risk assessment to R&D testing and pricing and promotions, 4) they are less likely to use off-the-shelf options, instead implementing significantly customized versions of publicly available models or developing their own proprietary foundation models, and 5) they have encountered challenges with their operating models.

Asian Americans are crucial in today’s knowledge economy: around 60% hold at least a bachelor’s degree and, despite representing only about 7% of the U.S. population, account for 50% of the workforce in leading Silicon Valley tech companies.

A detailed analysis of top Fortune 500 technology companies shows that Asian professionals are even less likely to progress in their careers today than they were a decade ago.

― Stop Overlooking the Leadership Potential of Asian Employees - Harvard Business Review [Link]

This article discusses the reasons why Asian employees’ careers stagnate, solutions organizations can adopt to help employees move past the roadblocks, and the reasons to invest in Asian employees.

How to do great work [Link]

A good summary of this great article:

(image: greatwork)

Introducing Apple’s On-Device and Server Foundation Models [Link]

This article provides details about how Apple developers trained models, fine-tuned adapters for specific user needs, and evaluated model performance.

The analogy here is to Search, another service that requires astronomical investments in both technology and infrastructure; Apple has never built and will never need to build a competitive search engine, because it owns the devices on which search happens, and thus can charge Google for the privilege of making the best search engine the default on Apple devices. This is the advantage of owning the device layer, and it is such an advantageous position that Apple can derive billions of dollars of profit at essentially zero cost.

First, with regards to the title of this Article, the fact it is possible to be too early with AI features, as Microsoft seemed to be in this case, implies that not having AI features does not mean you are too late. Yes, AI features could differentiate an existing platform, but they could also diminish it. Second, Apple’s orientation towards prioritizing users over developers aligns nicely with its brand promise of privacy and security: Apple would prefer to deliver new features in an integrated fashion as a matter of course; making AI not just compelling but societally acceptable may require exactly that, which means that Apple is arriving on the AI scene just in time.

― Apple Intelligence is Right On Time - Stratechery [Link]

This article is worth a read. Stratechery famously developed Aggregation Theory as applied to the internet and predicted much of how the Google/Facebook age unfolded through that lens. That lens is one main reason why the winners are still the old players in the AI era, at least for now.

How to Fund Growth (& when not to) [Link]

So You Want To Build A Browser Engine [Link]

Introducing the Property Graph Index: A Powerful New Way to Build Knowledge Graphs with LLMs [Link]

What matters most? Eight CEO priorities for 2024 - McKinsey [Link]

Gen AI’s second wave - McKinsey [Link]

Extracting Concepts from GPT-4 - OpenAI [Link]

OpenAI presents a new approach to interpreting concepts captured by GPT-4’s neural networks. They used sparse autoencoders to make sense of neural activity within LLMs and found 16 million features in GPT-4. Limitations are that 1) features can be hard to interpret, 2) there is no good way to check the validity of interpretations, 3) not all behaviors are captured, and 4) it is challenging to scale to frontier LLMs.
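The core trick is to train a wide, sparsity-constrained autoencoder on a model’s internal activations so that each learned feature fires on an interpretable concept. A minimal PyTorch sketch of that general technique (the dimensions, the top-k value, and the class itself are illustrative assumptions, not OpenAI’s actual code):

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy top-k sparse autoencoder over captured LLM activations.

    d_model and n_features are illustrative; OpenAI reports finding
    16 million features in GPT-4, far more than this toy holds.
    """
    def __init__(self, d_model: int = 768, n_features: int = 16384, k: int = 32):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, acts: torch.Tensor):
        pre = self.encoder(acts)
        # Keep only the k largest activations per example - the sparsity
        # constraint that pushes features toward interpretability.
        topk = torch.topk(pre, self.k, dim=-1)
        sparse = torch.zeros_like(pre).scatter(-1, topk.indices, topk.values)
        return self.decoder(sparse), sparse

# Objective: reconstruct the activations from only a few active features.
sae = SparseAutoencoder()
acts = torch.randn(8, 768)          # stand-in for real captured activations
recon, features = sae(acts)
loss = ((recon - acts) ** 2).mean() # minimized during training
```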

YouTube and Podcasts

In an H100 GPU, every second we can move at most 3.35 terabytes of RAM in and out of memory registers. And in the same second, we can multiply 1.98 quadrillion 8-bit floating point numbers. This means that it can do 591 floating point operations in the time it takes to move one byte. In the industry this is known as a 591:1 ops:byte ratio. In other words, if you are going to spend time moving an entire gigabyte around, you should do at least 591 billion floating point operations. If you don’t, you are just wasting GPU and potential compute. But if you do more than that, you are just waiting around on memory bandwidth to get your data in there. In our models, the amount of memory we need to move around is relatively fixed; it’s roughly the size of our model. This means that we do have some control over how much math we can do by changing our batch size.

In reality, we’ve discovered that bottlenecks can arise everywhere: from memory bandwidth, network bandwidth between GPUs and between nodes, and other areas. Furthermore, the location of those bottlenecks will change dramatically depending on the model size, architecture, and usage patterns.

― Behind the scenes scaling ChatGPT - Evan Morikawa at LeadDev West Coast 2023 [Link]

This is a behind-the-scenes look at how OpenAI scaled ChatGPT and the OpenAI APIs. It is also a very good talk for showing how hard it is to scale infrastructure as model architectures evolve, and how important it is to master these skills and this knowledge in the chip design and manufacturing industry and in the LLM development industry. The talk covers 1) GPU RAM and KV cache, 2) batch size and ops:byte, 3) scheduling across dozens of clusters, and 4) autoscaling (and the lack thereof).
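The 591:1 figure quoted above is just the ratio of the chip’s compute throughput to its memory bandwidth, and it is easy to sanity-check. A quick back-of-the-envelope in Python (the spec numbers are taken from the talk; exact values vary by H100 variant):

```python
# H100 figures as quoted in the talk (approximate, variant-dependent).
mem_bandwidth_bytes_per_s = 3.35e12   # ~3.35 TB/s of HBM bandwidth
fp8_flops_per_s = 1.98e15             # ~1.98 quadrillion 8-bit FLOPs/s

ops_per_byte = fp8_flops_per_s / mem_bandwidth_bytes_per_s
print(f"{ops_per_byte:.0f}:1 ops:byte ratio")   # -> 591:1

# To stay compute-bound, every gigabyte moved should be matched by
# at least this much math (~591 billion floating point operations).
min_flops_per_gigabyte = ops_per_byte * 1e9
```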

Key facts to consider when developing metrics for compute optimization and model scaling:

  1. GPU memory is valuable, and it is frequently the bottleneck - not necessarily compute.
  2. Cache misses are non-linear in their compute cost, because we suddenly need to start recomputing everything.

When scaling ChatGPT, we need to:

  1. Look at KV cache utilization and maximize all the GPU RAM we have, and
  2. Monitor batch size - the number of concurrent requests we run on the GPU at the same time - to ensure the GPUs are fully saturated. These are the two main metrics used to determine how loaded the servers are (a rough KV cache sizing sketch follows this list).
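Why KV cache utilization is the first metric: in a GPT-style decoder, every generated token must keep a key and a value vector per layer per attention head, so the cache grows linearly with batch size and sequence length and quickly competes with the model weights for GPU RAM. A rough estimator (the model settings below are illustrative GPT-3-class numbers, not OpenAI’s actual configuration):

```python
def kv_cache_bytes(batch_size: int, seq_len: int, n_layers: int,
                   n_heads: int, head_dim: int, bytes_per_value: int = 2) -> int:
    """Rough KV-cache footprint for a GPT-style decoder.

    Each token stores one key and one value vector per layer per head;
    bytes_per_value=2 assumes fp16/bf16 storage.
    """
    per_token = 2 * n_layers * n_heads * head_dim * bytes_per_value
    return batch_size * seq_len * per_token

# Illustrative GPT-3-class settings: 96 layers, 96 heads of dim 128.
gb = kv_cache_bytes(batch_size=8, seq_len=2048, n_layers=96,
                    n_heads=96, head_dim=128) / 1e9
print(f"~{gb:.0f} GB of KV cache")  # ~77 GB: nearly fills an 80 GB H100
```

At a batch size of just 8, the cache alone nearly fills an 80 GB H100 before any weights are loaded, which is why batch size and cache utilization together determine how loaded a server is.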

In reality, there are more bottlenecks (memory bandwidth, network bandwidth between GPUs, between nodes, and other areas), and where they arise can change with the model size, architecture, and usage patterns. This variability has made it very hard for AI model developers and chip manufacturers to design chips that get that balance right. Future ML architectures and sizes have been very difficult to predict, so overall the math needs to be re-tweaked as the models evolve.

The third challenge is to find enough GPUs. Note that the time of a response is dominated by the GPU streaming out one token at a time; as a result, it has been more important to simply get capacity and optimize a well-balanced fleet than to put things geographically close to users.

The fourth challenge is the inability to scale up this fleet. OpenAI has delayed some features of ChatGPT due to limited compute resources.

Some lessons they have learned in solving GPU challenges:

  1. It’s important to treat this as a system engineering challenge as opposed to a pure research project.
  2. It’s important to adaptively factor in the novel constraints of these systems.
  3. Every time the model architecture shifts, a new inference idea is proposed, or a product decision changes, we need to adapt and rerun a lot of this math. Diving really deep has been important; these low-level implementation details matter.

The final challenge is abuse of the system and AI safety.

For many years, particularly following the original SARS pandemic, there was a lot of conversations around how do we get in front of the next pandemic, how do we figure out what’s coming and how do we prepare for it. And there was a lot of research that was launched to try and resolve that key question. It’s like does the effort to try and stop the problem cause the problem. I think that from my point of view there is a very high probability that there was some leak that meant that the work that was going on to try and get in front of the next pandemic and understand what we could do to prepare ourselves, and what vaccines could be developed and so on, actually led to the pandemic. So then when that happens how do you respond when you are sitting in that seat, that’s the key question that I think this committee is uncovering. - David Friedberg

― Trump verdict, COVID Cover-up, Crypto Corner, Salesforce drops 20%, AI correction? - All-in Podcast [Link]

The TED AI Show: What really went down at OpenAI and the future of regulation w/ Helen Toner [Link]

In the interview, Toner revealed that the reason for firing Altman was his psychological abuse and manipulative behavior in different situations. Looking at Altman’s track record prior to OpenAI, these do not seem to be new problems for Sam.

How Walt Mossberg Built Relationships With Jobs, Gates, and Bezos - Big Technology Podcast [Link]

Nvidia’s 2024 Computex Keynote: Everything Revealed in 15 Minutes [Link]

What really works when it comes to digital and AI transformations? - McKinsey [Link]

… but what I do take offense at is labeling millions and millions of ordinary Americans as somehow lacking in empathy, lacking in caring, not being good parents, because you don’t like their support for Trump. And I think that that is a statement that frankly reeks of being cocooned in an elite bubble for way too long. Let me just explain. If you look at where Trump’s support is strongest, it’s really in Middle America and sort of the heartland of America, basically the part of America that the coastal elites dismissively refer to as flyover country. It’s a lot of the industrial Midwest, and frankly that part of the country has not had the same type of economic experience that we’ve had in Silicon Valley. They have not been beneficiaries of globalization. If you are in a handful of export industries in America, and I’m talking about if you are in Hollywood or you are in Big Finance or you are in Software, then globalization has been great for you, because it has created huge global markets for our products. However, if you are in an industry that has to compete with global exports, then it’s been very bad for you: blue collar workers have been hurt, labor’s been hurt, people who work with their hands have been hurt. They have not benefitted in the same way from the system we’ve had in this country for the last 30 years. So you can understand why they would not be so enchanted with elite thinking. I think to then label those people as lacking in caring or empathy or not being good parents because they haven’t had the same economic ride that you had for the last 30 years, and then you are the one who is fighting a legal battle to kick some of those people off the public beach in front of your beach house, and then you are saying they are the ones lacking in empathy, dude, look in the mirror. - David Sacks

This is the future of how smart, reasonable, moderate people should make decisions. It is an example. Talking to somebody you disagree with does not make your opinion bastardized; it actually makes your opinion valuable. There are these simple truths to living a productive life that, if you want to embrace them, you need to find friends that you can trust, so that even on issues where you disagree, you can hear them out. - Chamath Palihapitiya

There is no law that defines why you should or shouldn’t buy a security, with respect to the diligence you have individually done to determine whether the underlying business is worth the price you are paying. The law says that the businesses that are listing their securities for public trading have an obligation to make disclosures on their financials and any other material events to the public, and they do that through the SEC filing process. That’s all out there. And then what you as an individual will do with it is up to you. - David Friedberg

― DOJ targets Nvidia, Meme stock comeback, Trump fundraiser in SF, Apple/OpenAI, Texas stock market - All-in Podcast [Link]

Deloitte’s Pixel: A Case Study on How to Innovate from Within - HBR On Leadership Podcast [Link]

WWDC 2024 — June 10 | Apple [Link] [Short Version]

Apple’s promising updates on visionOS, iOS, Audio&Home, watchOS, iPadOS, and macOS.

Let’s reproduce GPT-2 (124M) [Link]

This four-hour video guides you through creating a fully functional GPT-2 model from scratch. It includes details about model construction, speed optimization, hyperparameter setup, model evaluation, and training.

Building open source LLM agents with Llama 3 [Link]

LangChain and Meta uploaded new recipes/tutorials for building agents that run locally using LangGraph and Llama 3.

Apple took a shortcut to get here, they partnered with open ai. And this is something that I don’t think they’ve ever really done before at the operating system level. Apple is famous for being vertically integrated, for being a walled garden, for being end to end. They control everything from the chips to the hardware to the operating system, and they don’t let anybody else in, until you are at the App Store Layer. This is allowing somebody in beneath the level of the App Store. This is allowing someone OpenAI to get access to your data, and to control your apps, at the operating system level. Elon pointed out wait a sec what are the privacy implications here. And I think there are major privacy implications. There is simply no way that you are going to allow an AI on your phone to take. Remember Apple in the past has been the advocate for consumer privacy. There is a whole issue of the San Bernardino terrorist where the FBI went to Apple and said we want you to give us back door access to their phone and Apple refused to do it and went to court to defend user privacy. And furthermore, one of Apple’s defenses to the antitrust arguments for allowing sideloading and allowing other apps to get access to parts of the operating system, they’ve always said we can’t do this because it would jeopardize user privacy and user security. Well here they are opening themselves up to OpenAI in a very deep and fundamental way in order to accelerate the development of these features… I think this is going to open Pandora’s box for Apple, because again they’ve proven that they can open up the operating system to a third party now, and who knows what the privacy implications of this are going to be. - David Sacks

I think there are three numbers that matter: the inflation rate, the growth in GDP, and the cost to borrow. The growth in GDP in the first quarter of 2024 was a lousy 1.3% on an annualized basis. And even if the rate of inflation came down, we are still inflating the cost of everything by north of 3%. So the economy is only growing by 1.3% and it costs more than 3% each year to buy stuff. So that means everyone’s spending power is reducing, and our government’s ability to tax is declining, because the economy is only growing by 1.3%. And the most important fact is that the interest rates are still between 4-5% (4.7%). That means that borrowing money costs 4.7%, but the business, the economy on average, is only growing 1.3%. So just think about that for a second. We have a tremendous amount of leverage on businesses, on the economy, on the federal government. That leverage, the cost to pay for that debt, is more than 4-5%, but you are only growing your revenue by 1.3%. So at some point you cannot make your payments. That is true for consumers, it’s true for enterprises, and it’s true for the federal government. The whole purpose of raising rates is to slow the flow of money through the economy. And by slowing the flow of money through the economy, there is less spending, which means that you are reducing the demand relative to the supply, so the cost of things should come down; you should reduce the rate of increase in the cost of things… There is certainly a shift in the market, because what this tells us is that the timeline at which the Fed will cut rates is coming in a little bit. So the market is saying okay, let’s adjust to lower rates; the 10-year treasury yield has come down a little bit, but we are still in a difficult situation for people and for businesses. - David Friedberg

If the revenue of everything combined, which is GDP, isn’t growing faster than the increase in the cost of everything, people, businesses, and government can’t afford their stuff. And that’s fundamentally what is going on right now. What we need to see is a normalization where GDP growth is greater than the inflation rate. And as soon as that happens, then we have a more normalized and stable economy. So right now things are not stable. There is a lot of difficulty and strain in the system. - David Friedberg

You had a 1.3% GDP growth rate with a 6%-of-GDP deficit by the government. If the government wasn’t printing so much money, wasn’t overspending, and you were to have a balanced budget, it would be a recession. It would be negative GDP growth if not for the government’s programs stimulating the economy. And a lot of the jobs you are talking about are government jobs. The government is creating jobs like crazy, not in the private sector but in the public sector, because it is an election year. So there are a lot of political forces propping things up, and I wonder what happens after the election. - David Sacks

― Elon gets paid, Apple’s AI pop, OpenAI revenue rip, Macro debate & Inside Trump Fundraiser - All-in Podcast [Link]

Energy is high at the beginning with a blackjack! They went through several news items and topics: 1) Elon’s comp package was approved by shareholders, and the besties criticized those who reneged, who are really not good people to do business with, 2) Apple announced “Apple Intelligence” and the ChatGPT deal at WWDC, opening up its OS to a third party for the first time and raising data privacy concerns, 3) OpenAI reportedly hit a $3.4B revenue run rate, and 4) the state of the US economy.

Leopold Aschenbrenner - 2027 AGI, China/US Super-Intelligence Race, & The Return of History [Link]

An interview with Leopold Aschenbrenner, referring to his 165-page essay about AI safety: https://situational-awareness.ai/.

Private Cloud Compute: A new frontier for AI privacy in the cloud - Apple Security Research [Link]

What I learned from all of that, if I look at his mistakes and successes, I learned a couple of things. The first is most of the money he’s made by holding onto things, not the momentum trading that he was known for in public equities. And number two, he made most of his money buying quality; quality was very important to his success. And that’s something I’ve taken from him. And I truly believe the environment in which you grow up dictates the kind of person and mentality you will have in life. So let me just quickly expand on that. If you grow up during the Great Depression, I would presume you’re focused on saving every penny and looking for cheap. And I think, I’m just taking a guess, that you’re gonna be much more focused on buying cigar butts in life than you might be on buying the highest quality asset you can find and maybe paying up for it. Somehow my father figured out that the real money is in the best businesses and the higher quality assets. And he instilled that in me very early. And sometimes these things look expensive. And so my point earlier is that he taught me very early not to dismiss something that might look expensive on the surface before you do the deep work and really understand what you’re buying and what it’s worth.

And the problem that a lot of value investors have, I think, is that they all screen for low P/E ratios, high dividend yields, you know, low EV-to-cash-flows, whatever it is. They’re screening, and they will miss, because any value in that area is gonna get competed away. And so I’m more interested in businesses that might look expensive on the surface but actually aren’t, and you have to be careful, because a lot actually are expensive, right? And there’s no margin of safety there. But, as Peter Kaufman said, there’s margin where there’s mystery. I think that’s so true. There’s margin where there’s mystery. And so sometimes the best investments are those that are misunderstood and might appear expensive, and they’re often quality kinds of businesses. So I gravitate toward quality partly as a result of that influence he had on me, if that makes sense.

Tesla is definitely one of the most misunderstood companies that I’ve seen in my 25 years or so of managing money for others. It’s such an interesting company, and it’s so misunderstood, and I think it’s misunderstood for a few reasons. One, you have this kind of overarching personality where people start to formulate opinions based on what Elon is. Two, people think about it as a car company. And third, most people haven’t actually dived in, right? They’ve not spent the hundreds and hundreds of hours on this company, and they haven’t really gone through the financials to understand the economics of the business. So there are a few things coming together where I think that this company still remains misunderstood, but there’s a reason why it’s up whatever 15,000% or so since the IPO. And there is a reason why it continues to go higher over time: there are plenty of people that do get it, and it’s just getting more and more concentrated into what I would probably call the hands of smart money.

I’d say it’s not just a car company; it is very much an EV company, but it’s not just a car company… Why would the future look any different? And of course it did. Why were people skeptical? Well, there were no paved roads, supply chains were very limited, there were very few fueling stations, and there was very little manufacturing capacity. Sounds kind of familiar today, right, in EV terms? But with ICE vehicles, Henry Ford disrupted the horse and carriage very quickly, within 20 years, which happens to be pretty much the timeframe during which S-curves take formation - a 20-year disruption period with respect to pretty much all these transformational technologies.

Going back to the Gutenberg printing press and the steam engine and the spinning wheel, it’s all about 20 years. So my point is that there was all of this skepticism, and you had the horse and carriage competing against this noisy ICE vehicle. Both were forms of transportation, both got you from point A to point B, but one was fundamentally different. It was fundamentally different because it was a much more efficient process of getting you from point A to point B. And that is the lens, from a kind of very high top-down level, through which I look at Tesla and electric vehicles in general: they’re just a much more efficient way to get you from point A to point B than ICE vehicles. And what I mean by that is the cost of ownership and cost per mile is just much lower. And so then it’s a question of: what are the risks? What competitive advantages does Tesla have over the rest of the competition in EV, and how will Tesla survive and thrive against ICE vehicles? Which to me are going the way of the dinosaur. That is a big assumption that I believe is true, because I think that EV adoption is following the traditional S-curve adoption phase, and there’s not a lot of time left for ICE vehicles to exist.

The real voyage of discovery is not in seeking new landscapes, but in having new eyes.

I gave you the first kind of lens through which I’m looking at Tesla: competing as an EV company against ICE vehicles, with EVs as a whole being much more efficient than ICE. But the other lens I should share is that I don’t think you can understand this company if you don’t understand - and I could be totally wrong, but assuming that I’m right - that to me it’s an advanced electronics manufacturer and software company competing against traditional automobile manufacturing companies.

This is super important. It’s an electronics slash software company competing against traditional auto. It’s super important to understand that, because certain things kind of come into play. There was an aeronautical engineer by the name of Theodore Wright; he devised this concept called Wright’s Law, which states that for every doubling of cumulative production, costs fall by a constant percentage. And when you understand that Tesla is an electronics and software company, you understand where this company is along this Wright’s Law curve.

Tesla is so much further along the curve than any of the ICE vehicle makers, which are constrained because they’re not electronics and software companies; they’re traditional auto companies. And Tesla is so much further along the curve with respect to other EV companies. And so as ICE vehicles, traditional auto, catch up, Tesla just moves much further along the curve. And so the spread between the competitive advantage of Tesla, the other EV companies, and traditional auto is actually widening. It’s not getting more narrow; it’s widening, because of Tesla’s massive scale, which allows it to push itself further along the cost curve than anybody else.

We actually did buy more around current prices, and these arguments - most of them at least, except for the $50 billion compensation package - sound like the same arguments that you could go back and read since the company went public. There are probably fewer negative articles today than there were around 2011, 2012, 2013. But they seem very similar. And yet the stock continues to go higher, up 15,000 or so percent since its IPO. And that’s the case with pretty much every growth company from Amazon to Microsoft: for any great growth company, there are always periods when the stock is not going up. There have been so many massive drawdowns in Tesla since its IPO - there have got to be at least a couple dozen 30% or 40% drawdowns since its IPO. And that’s just part of investing in growth companies. I wrote a paper called Power and Challenges of Compounding, which is on our website, but growth companies just don’t go in straight lines. They move more like in a step formation. And if you look at the kind of longer-term chart of Tesla, Tesla’s just in a step formation, just like Amazon was and Microsoft was.

I totally focus on the business fundamentals. I don't let the stock price dictate what I'm thinking about the business. When I'm looking at a potential company to buy, I try my very best not to pay any attention to the stock price and just come up with my own idea of what I think the business is worth. So the key is really to focus on the fundamentals, and if the fundamentals are moving in the right direction, the stock price will take care of itself over time. The problem is that if you have all your money in one or even five companies, or maybe even ten companies, it's really, really hard to deal with that emotionally.

Mr. Rogers was not just the TV host at the center of Mister Rogers' Neighborhood; he was also a Presbyterian minister. I think that upbringing, education, and way of life influenced how he dealt with people. He had this wonderful expression: there are three ways to ultimate success. The first is to be kind, the second is to be kind, and the third is to be kind. I thought that was really interesting and very powerful, and it meshed with how I wanted to try to live life. It also meshes with this idea of reciprocity, which is deep-rooted in the human condition: if we give kindness to the world, it brings kindness back. I truly believe that. So he was, and has been, influential in how I think about life, and I try my best - I don't always succeed, but I do try my best - to live my life according to Mr. Rogers' values.

Peter said that, look, you need to look at your life as one ladder, and there are seven steps to the ladder, pretty much in this order: health, then family and friends, career, community, spirituality, and hobbies. The most important of those seven steps is health, because health is multiplicative: if you take health and multiply it by zero, everything else goes to zero. Not good. So you want to focus on that first and foremost.

― Real Success w/Christopher Tsai - We Study Billionaires Podcast [Link]

In this episode Christopher talked about his family history, investment in Tesla, Microsoft, Visa, and Mastercard, and some other personal development and investment tips. This is what I’ve learned:

  1. It completely changed my understanding of Tesla. I thought Tesla's main business was traditional automobiles, with EV as a new leading branch at the frontier, but it turns out that's wrong. Tesla is actually an electronics and software company wearing the cover of a traditional automaker. As traditional auto companies catch up, Tesla just moves further and further along the curve. Being aware of what's happening around Tesla - a falling stock price, slowing sales growth, BMW and BYD flooding the market with EVs, an unfocused CEO in Elon Musk, no new car models since 2020, etc. - I had already started questioning Tesla. But in fact, Elon has beaten almost every milestone and kept almost every promise, and he deserves his $50B pay package. What Tesla is currently going through is as normal as what every other growth company has experienced.
  2. The key to being a value investor and picking valuable stocks is to focus on the business fundamentals, not the stock price.
  3. Diversifying a portfolio is a good way to manage pain and emotions, and it stops you from interrupting the compounding process.
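
As a side note, the Wright's Law idea from the episode is easy to state precisely: if unit cost falls by a fixed fraction r with every doubling of cumulative production, then the cost of unit n is C(n) = C(1) * n^log2(1-r). A toy sketch with made-up numbers (my own illustration, not Tesla's actual costs):

```python
import numpy as np

# Toy Wright's Law: every doubling of cumulative production cuts unit cost
# by a constant percentage. All numbers below are hypothetical.

def wrights_law_cost(first_unit_cost, cumulative_units, drop_per_doubling):
    """Unit cost after `cumulative_units`, assuming a constant fractional
    cost decline per doubling of cumulative production."""
    b = np.log2(1 - drop_per_doubling)  # learning exponent (negative)
    return first_unit_cost * cumulative_units ** b

# Hypothetical $50,000 first unit and a 15% cost drop per doubling:
for n in [1, 2, 4, 8, 1_000_000]:
    print(n, round(wrights_law_cost(50_000, n, 0.15), 2))
```

The exponent form makes the podcast's point concrete: whoever has the largest cumulative production sits furthest down the curve, and the absolute cost gap can widen even as laggards also move along it.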

In conversation with President Trump - All-In Podcast [Link]

They brought President Donald Trump to the show! They asked great questions, and their 40-minute high-quality recap is very impressive. By the way, I do feel President Trump is a really engaging person - you can tell just by listening to him speak.

Data + AI Summit Keynote Day 1 - Full - Databricks [Link]

Experts, researchers, and open source contributors from Databricks and across the data and AI community gathered in San Francisco June 10-13, 2024 to discuss the latest technologies in data management, data warehousing, data governance, generative AI for the enterprise, and data in the era of AI.

Notes of “Ali Ghodsi, Co-founder and CEO, Databricks”:

There are three problems facing AI practitioners:

  1. Everyone wants AI

    Organizations don't care about MMLU performance; they care about using models that do well on their data, for their use cases and businesses. According to a survey, 85% of use cases have not made it into production. This indicates that getting AI on your data into production is hard - people want high quality, low cost, and privacy.

  2. Security and privacy are under pressure

    It’s under intense pressure. People care about AI regulation, data privacy, and cyberattacks.

  3. Data estate is fragmented.

    Lots of complexity, huge costs, and proprietary lock-in.

Databricks' solution to these three problems is the data intelligence platform. The idea is: don't give your data to vendors; instead, own your own data and store it in a data lake in a standard format. This is what Databricks is doing:

  1. They acquired Tabular because they want data to be stored in a standard format that every engine can access.
  2. They also launched project UniForm, which aims to ensure 100% compatibility and interoperability across both the Delta Lake and Iceberg formats.
  3. Unity Catalog provides governance - access control and security, plus discovery, lineage, auditing, and data and model quality monitoring. And they have just open-sourced Unity Catalog.
  4. Data in the data lake combined with Mosaic AI's models is called the Data Intelligence Platform. This platform trains Gen AI models on your data in isolation for each customer and leverages that throughout the platform for everything it does.
  5. Data Intelligence is democratized data plus democratized AI. Democratized data means everyone in your organization should be able to access the data directly; even those who don't know SQL should be able to access data or get insights from it using natural language. Democratized AI means everyone should be able to create AI models that understand the data in your organization.
  6. All of Databricks is now available 100% serverless.

Notes of “Brian Ames, General Motors”:

Their mission is zero crashes, zero emissions, and zero congestion. For GM to be part of the future, it needs to become a software company. They started with data silos, on-prem infrastructure, and the challenge of keeping pace with innovation. Their vision and strategy are to change the culture, move to the cloud, create a data insights factory in the cloud, and build upon Databricks.

GM's data insights factory today includes a single source of truth (where big data is ingested), trusted data (with GenAI platform functions, ETL and orchestration, and data warehousing), an open ecosystem (with Meta's LLMs, AI models, and data governance), and a React front end. The majority of the factory is supported by Databricks.

Morgan Housel: Get Rich, Stay Rich - The Knowledge Project [Link]

The Founder of Rolex: Hans Wilsdorf - Founders [Link] [Transcript]

China’s AI Journey - Weighty Thoughts [Link]

E156 | Has the GPT Moment Arrived for Autonomous Driving? On Tesla V12, FSD Entering China, and RoboTaxi - The Silicon Valleyer with Jane [Link]

A good discussion about Tesla V12, FSD in China, and RoboTaxi. I learned new perspectives on end-to-end technology, Musk's vision, the EV market landscape, EV competitors, etc.

It makes so much sense for them, so I think they should do it as quickly as possible. We are in the first inning of what should probably be an enormous tectonic shift in technology. And whoever wins in the first inning usually isn't the one that's winning by the ninth inning. So I would encourage anybody that's winning right now to monetize, get secondaries, take money off the table as fast as possible. Because the future is unknown, and the more disruptive the technology is, the more entropy there is, which means that there are going to be more changes, not fewer. And again, I would just look at search as an example, I would look at social networking as an example. When you look 20 years later, the people who captured all the value were not the ones that everybody at the beginning thought were going to win. And so if it plays out similarly, it's important for the people that are in the lead today to recognize that it's too early, and they should monetize their perceived success as quickly as they can, to the largest magnitude possible. - Chamath Palihapitiya

I think OpenAI is running a very strategic game plan to become part of the tech establishment as quickly as they can, so that they are on the inside looking out as opposed to the outside looking in. They were able to add the former head of the NSA to their board of directors. That's how you become part of the establishment. Do you think the former head of the NSA no longer has a security clearance or no longer knows people in the NSA? Of course not. And I think there is a group of people that want to make sure that these kinds of technologies and capabilities are firmly within the hands of the US apparatus and not anybody else. So that pulls them closer to the kinds of folks that could otherwise give them a hard time or regulate them, etc. So now, when you have Senate hearings about this stuff, it's more likely that it's confidential, behind closed doors, under the purview of national security. All these things are beneficial to OpenAI. And secondly, they were able to get Elon to drop his lawsuit. So the next logical step is to create capital-market distribution, which is really about syndicating ownership of the company to all the big, deep pools of money, so that they are also rowing in the same direction in support of OpenAI. That's what a lot of people don't get: it's not about valuations or this and that, it's about creating a high-level game theory of how to build an international apparatus that supports your corporate objectives. There are a few companies that have done this well, and they are now one of them. The only thing left is to get shares into the hands of the BlackRocks, the T. Rowes, all the big mutual fund apparatuses of the world that then syndicate to all the individual investors of the world. You have everything: you have government connections, you have no real legal overhang, and the likelihood that an IRS agent all of a sudden decides to audit OpenAI is basically zero. It's a smart business strategy. - Chamath Palihapitiya

“Microsoft excels with bundling. It's their not-so-secret weapon for dominating new markets. We know the playbook: Office + Teams, Windows + Explorer, Azure + Visual Studio, 365 + OneDrive, & Xbox + Game Pass.” - Marc Benioff @X

― Presidential Debate Reaction, Biden Hot Swap?, Tech unemployment, OpenAI considers for-profit & more [Link]

Statistical Learning Course - Stanford Online [Link]

The last time I watched this lecture series was 3 years ago. Happy to see these two old guys again (Trevor Hastie and Robert Tibshirani). I'm planning to review the whole series in my spare time.

Machine Learning Course - CS 156 Caltech [Link]

This is an unusual (to me) ML course with a strong focus on theory. It's taught from a very different perspective, complementary to what I have learned in my other ML courses. Definitely going to watch it once I have time.

EfficientML.ai Lecture, Fall 2023, MIT 6.5940 - MIT HAN Lab [Link] [Website]

I have to watch this. Feed me knowledge please! ヾ(◍°∇°◍)ノ゙

Papers and Reports

SimPO: Simple Preference Optimization with a Reference-Free Reward [Link]

Deep Learning Interviews: Hundreds of fully solved job interview questions from a wide range of key topics in AI [Link]

Best preparation book for AI/ML job seekers and students.

The economic potential of generative AI: The next productivity frontier - McKinsey [Link]

Microsoft New Future of Work Report 2023 [Link]

The Prompt Report: A Systematic Survey of Prompting Techniques [Link]

Situational Awareness: The Decade Ahead [Link]

Better & Faster Large Language Models via Multi-token Prediction [Link]

How Can Recommender Systems Benefit from Large Language Models: A Survey [Link]

Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers [Link]

Efficient data generation for source-grounded information-seeking dialogs: A use case for meeting transcripts - Google Research [Link]

This unique open-source dataset, “Meeting Information Seeking Dialogs,” aims to improve interaction with meeting recordings through conversational AI models. It allows users to query and engage with transcript content efficiently through a purpose-built agent.

Meta Large Language Model Compiler: Foundation Models of Compiler Optimization - Meta Systems Research [Link]

Meta LLM Compiler is a family of models built on Meta Code Llama with additional code optimization and compiler capabilities.

ESM3: Simulating 500 million years of evolution with a language model - EvolutionaryScale Research [Link]

ESM3 is an AI model capable of understanding and predicting the sequence, structure and function of proteins, simulating evolutionary processes, and generating new proteins with specific traits.

Github

Tool Use - Anthropic [Cookbook] [Notebook]

Anthropic has launched a new feature for its AI assistant, Claude, known as “Tool Use” or “function calling.”
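
A minimal sketch of what the feature looks like with the Anthropic Python SDK; the get_weather tool and its schema below are my own made-up example, not Anthropic's:

```python
import anthropic

# Sketch of Claude tool use (function calling): declare a tool with a JSON
# schema, let the model decide whether to call it, then inspect the response.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    tools=[{
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
)

# If Claude decides to call the tool, the response contains a tool_use block
# with the arguments; your code runs the tool and sends the result back.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)  # e.g. get_weather {'city': 'Paris'}
```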

All Machine Learning Algorithms Implemented in Python / Numpy [Link]

Spreadsheet Is All You Need [Link]

The GPT architecture is recreated in a spreadsheet.

Hello Qwen2 [Link]

Alibaba released a new open-source LLM called Qwen2, which outperforms Meta's Llama 3 on specialized tasks. It's accessible via HuggingFace, with weights available in five model sizes (0.5B, 1.5B, 7B, 57B with 14B active (MoE), and 72B). Qwen2 has been trained on data in 29 languages and can handle a context length of up to 128K tokens. It has been benchmarked against Meta's Llama 3 and OpenAI's GPT-4, achieving top scores. The primary innovation of Qwen2 is its long-context understanding.
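
For reference, a minimal sketch of running one of the instruct variants with Hugging Face transformers; the repo id below is my assumption of the naming scheme on the Hub:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for the 7B instruct variant.
model_id = "Qwen/Qwen2-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Give me a one-line summary of MoE."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```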

Cohere Cookbooks [Link]

A set of tutorials for building agents / AI applications.

Mistral Cookbooks [Link]

Introducing llama-agents: A Powerful Framework for Building Production Multi-Agent AI Systems - LlamaIndex [Link]

News

Amazon to expand drone delivery service after clearing FAA hurdle [Link]

Amazon's drone delivery service “Prime Air” was announced more than a decade ago but has struggled since. In 2022, Amazon said it would begin testing deliveries in College Station, Texas. In 2023, Prime Air was hit by layoffs. But recently Amazon said it would expand drone operations to Phoenix, Arizona, among other places, and it is expected to reach more cities in 2025.

Salesforce’s stock suffers its biggest drop in two decades [Link]

NASA’s James Webb Space Telescope Finds Most Distant Known Galaxy [Link]

Saudi fund joins $400m funding round of Chinese AI startup Zhipu [Link]

A PR disaster: Microsoft has lost trust with its users, and Windows Recall is the straw that broke the camel’s back [Link]

Because Microsoft has done a lot of things over the last few years that degrade the Windows user experience (obtrusive ads, full-screen popups, ignoring app defaults, forcing Microsoft Accounts, etc.), it has lost the trust of Windows users. As a result, a tool like Recall gets described by users as literal spyware or malware, no matter how well the features are communicated to the world.

Apple Made Once-Unlikely Deal With Sam Altman to Catch Up in AI [Link]

The deal between Apple and OpenAI will give OpenAI access to hundreds of millions of Apple users, and bring Apple the hottest technology of the AI era - one that can pair with its own services.

Nvidia is now more valuable than Apple at $3.01 trillion [Link]

Mark this day: on Jun 5, 2024, Nvidia's market cap surpassed Apple's, making it the second most valuable company in the world.

SpaceX’s Starship Rocket Successfully Completes 1st Return From Space [Link]

Woman Declared Dead Is Found Alive at Funeral Home [Link]

BYD Launches Hybrids With 1,300-Mile Driving Range [Link]

China’s plan to dominate EV sales around the world [Link]

Nvidia emails: Elon Musk diverting Tesla GPUs to his other companies [Link]

How Apple Fell Behind in the AI Arms Race [Link]

Next week, at Apple's annual Worldwide Developers Conference, the company is set to join the AI arms race by announcing an array of generative AI upgrades to its software products, including Siri.

Tesla’s $450 lightning-shaped bottle of mezcal is its most expensive liquor yet [Link]

Apple’s Upcoming AI Reveal, Pika Labs Raises $80 Million, Twelve Labs, $50 Million [Link]

Among the biggest spenders on sovereign AI is Singapore, whose national supercomputing center is being upgraded with Nvidia’s latest AI chips and where state-owned telecom Singtel is pushing an expansion of its data center footprint in Southeast Asia in collaboration with Nvidia. The country is also spearheading a large language model that is trained on Southeast Asian languages.

Other big projects are taking place in Canada, which last month pledged $1.5 billion as part of a sovereign computing strategy for the country’s startups and researchers, and Japan, which said it is investing about $740 million to build up domestic AI computing power this year following a visit from Huang.

Similar pushes are spreading across Europe, including those in France and Italy, where telecom companies are building AI supercomputers with Nvidia’s chips to develop local-language large language models. French President Emmanuel Macron last month called on Europe to create public-private partnerships to buy more graphics processing units, or the core chips used to train AI, to push its share of those deployed globally from 3% currently to 20% by 2030 or 2035.

― Nvidia’s New Sales Booster: The Global Push for National AI Champions [Link]

Cloud-computing giants and big tech companies have been a great source of revenue for NVIDIA; now sovereign AI is another lever. Governments demand sovereign clouds for their AI infrastructure and sensitive data, and US tech companies such as NVIDIA are eager to build those for them. The question is how long they can keep up this momentum in generating high revenue.

Do your best creating, thinking, learning, brainstorming, note-taking - Google NotebookLM [Link]

Google upgraded its NotebookLM powered by Gemini 1.5.

There’s one other way Apple is dealing with privacy concerns: making it someone else’s problem. Apple’s revamped Siri can send some queries to ChatGPT in the cloud, but only with permission after you ask some really tough questions. That process shifts the privacy question into the hands of OpenAI, which has its own policies, and the user, who has to agree to offload their query. In an interview with Marques Brownlee, Apple CEO Tim Cook said that ChatGPT would be called on for requests involving “world knowledge” that are “out of domain of personal context.”

Apple’s local and cloud split approach for Apple Intelligence isn’t totally novel. Google has a Gemini Nano model that can work locally on Android devices alongside its Pro and Flash models that process on the cloud. Meanwhile, Microsoft Copilot Plus PCs can process AI requests locally while the company continues to lean on its deal with OpenAI and also build its own in-house MAI-1 model. None of Apple’s rivals, however, have so thoroughly emphasized their privacy commitments in comparison.

― Here’s how Apple’s AI model tries to keep your data private [Link]

Introducing Apple Intelligence, the personal intelligence system that puts powerful generative models at the core of iPhone, iPad, and Mac [Link]

Ilya Sutskever Has a New Plan for Safe Superintelligence [Link]

AI Employees Should Have a “Right To Warn” About Looming Trouble - Big Technology [Link]

Ilya left OpenAI in mid-May and started Safe Superintelligence Inc. in mid-June, with the goal of creating a safe, powerful AI system. He is joined by Daniel Gross (former Apple Inc. AI lead) and Daniel Levy (former AI engineer at OpenAI).

To me, it's not a bad thing to let OpenAI safety employees leave and start their own business. It's actually a good thing for both sides. OpenAI, led by Sam, is eager to stay the leading AI company and develop AGI as quickly as possible; having colleagues worried about AI safety would only slow that progress. So the goals of OpenAI's vision and mission and of its safety team don't align. That's not to say AI safety is unimportant. I'm saying the AI pioneers and the AI safety camp should exist individually and separately, so that they check each other in an official, public, even-handed way, rather than conflicting with each other from the inside.

Musk’s xAI supercomputer will get server racks from Dell and Super Micro [Link]

Dell and Super Micro Computer will provide server racks for the supercomputer being developed by xAI. The supercomputer aims to power the next iteration of xAI’s chatbot Grok, which requires a vast number of Nvidia GPUs for training. Musk plans to have the supercomputer operational by fall 2025.

Releasing New AI Research Models to Accelerate Innovation at Scale - Meta News [Link]

  • Meta Chameleon: 7B & 34B language models

    Open-source models are catching up to GPT-4o. This is the first open-source base model able to take multimodal inputs and generate outputs. Unfortunately, it currently has a research-only license.

  • Meta Multi-Token Prediction LLM

    Meta released a language model for code completion using multi-token prediction. The approach was newly proposed in a paper aiming to build better and faster LLMs by using multi-token prediction. [Paper]

  • Meta JASCO: text-to-music models

    They released generative text-to-music models able to accept various conditioning inputs for greater controllability.

  • Meta AudioSeal: audio watermarking model

    This is the first model designed specifically for the localized detection of AI-generated speech, available under a commercial license. It would be very useful for detecting deepfakes.

  • Additional RAI artifacts

    To ensure geographic and cultural diversity in text-to-image models, they developed automatic indicators to evaluate potential geographic disparities in such models, and they have now released the evaluation code and annotations. [Paper]

Claude 3.5 Sonnet - Anthropic News [Link]

On Claude 3.5 Sonnet - Zvi Mowshowitz [Link]

Claude 3.5 Sonnet achieves higher performance on various key metrics and tasks, outperforming competitor models, and runs at twice the speed of Claude 3 Opus at one-fifth the cost. Anthropic also introduced a new feature called Artifacts on Claude.ai to expand how users can interact with Claude.

AI tools are coming to Gmail, Google Drive, and Firefox [Link]

Google is integrating AI side panels powered by Gemini into Gmail, Docs, Sheets, Slides, and Google Drive, enhancing writing assistance, summarization, and content creation.

Firefox starts letting you use AI chatbots in the sidebar [Link]

Mozilla is incorporating AI chatbots into Firefox. ChatGPT, Google Gemini, HuggingChat, or Le Chat Mistral are options for users to choose in the sidebar.

Meet Sohu, the fastest AI chip of all time. - Etched @X [Link]

Etched is building Sohu, a specialized chip for transformer models that it claims is the fastest AI chip yet.

Gemma 2 is now available to researchers and developers - Google Developers [Link]


When we talk about investment, we talk about economic value. The current situation of AI is very similar to Cisco's in 2000. Cisco, as an internet infrastructure company, spread the capacity of the World Wide Web, but people soon realized that the economic value was not in internet infrastructure itself; the opportunities were in e-commerce and the like. AI is a tool very similar to web tech. Currently, with heightened expectations, people are pouring investment and capital expenditure into AI model development, yet end-user demand is unclear and revenue is relatively minimal. From a very long-term perspective, this makes AI look like a bubble.

Stepping closer, though, there is still room in the market to party. GPUs for training and inference are increasingly in demand. The first round of beneficiaries are cloud and ads; the second round could be hardware or something else. Although it can look like a scheme funneling ever more money to big tech, as small open-source models are released, moats are expected to disintegrate and be distributed. I've seen more and more enterprises integrating Gen AI into their business or operations now. Enterprises will keep being transformed to be more efficient and productive, and so will human life, given this long-lasting attention on AI. This kind of sustained attention and consistent innovation is different from internet tech in 2000 and will probably create momentum against a bubble.

Substack

To me, the best model going forward is going to be based on the weighted performance per parameter and training token count. Ultimately, a model keeps getting better the longer you train it. Most open model providers could train longer, but it hasn’t been worth their time. We’re starting to see that change.

The most important models will represent improvements in capability density, rather than shifting the frontier.

In some ways, it’s easier to make the model better by training longer compared to anything else, if you have the data.

The core difference between open and closed LLMs on these charts is how undertrained open LLMs often are. The only open model confirmed to be trained on a lot of tokens is DBRX.

― The End of the “Best Open LLM” - Interconnects [Link]

A good analysis of the direction of open LLM development in 2023 and 2024. In 2023, models progressed on MMLU by leveraging larger compute budgets to scale active parameters and training tokens. In 2024, the direction has shifted to be roughly orthogonal: improving on MMLU while keeping compute budgets constant.

The companies that have users interacting with their models consistently have moats through data and habits. The models themselves are not a moat, as I discussed at the end of last year when I tried to predict machine learning moats, but there are things in the modern large language model (LLM) space that open-source will really struggle to replicate. Concretely, that difference is access to quality and diverse training prompts for fine-tuning. While I want open-source to win out for personal philosophical and financial factors, this obviously is not a walk in the park for the open-source community. It’ll be a siege of a castle with, you guessed it, a moat. We’ll see if the moat holds.

― Model commoditization and product moats - Interconnects [Link]

The goal of promoting scientific understanding for the betterment of society has a long history. Recently I was pointed to the essay The Usefulness of Useless Knowledge by Abraham Flexner in 1939 which argued how basic scientific research without clear areas for profit will eventually turn into societally improving technologies. If we want LLMs to benefit everyone, my argument is that we need far more than just computer scientists and big-tech-approved social scientists working on these models. We need to continue to promote openness to support this basic feedback loop that has helped society flourish over the last few centuries.

The word openness has replaced the phrase open-source among most leaders in the open AI movement. It’s the easiest way to get across what your goals are, but it is not better in indicating how you’re actually supporting the open ecosystem. The three words that underpin the one messy word are disclosure (the details), accessibility (the interfaces and infrastructure), and availability (the distribution).

― We disagree on what open-source AI should mean - Interconnects [Link]

Google: “A Positive Moment” [Link]

Reports of Google Search's death are exaggerated so far. In fact, search advertising has grown faster at Google than at Microsoft. User search behavior is harder to change than people expected. Also, Google is leading the development of AI-powered tools for Search: 1) “circle to search” allows a search from an image, text, or video without switching apps; 2) “Point your camera, ask a question” allows multisearch with both images and text for complex questions about an image. Overall, SGE (Search Generative Experience) is revolutionizing the search experience (the “10 blue links”) by introducing a dynamic AI-enhanced experience. So far, from what I have observed, AI powers Google Search rather than weakens it.

Amazon: Wild Margin Expansion - App Economy Insights [Link]

Amazon's margin expansion: AWS hit a $100B run rate with a 38% operating margin; ads are surging; delivery costs have been reduced.

The biggest risk is not correctly projecting demand for end-user AI consumption, which would threaten the utilization of the capacity and capital investments made by tech firms today. This would leave them exposed at the height of the valuation bubble, if and when it bursts, just like Cisco’s growth story that began to unravel in 2000. After all, history may not repeat, but it often rhymes.

At the Upfront Ventures confab mentioned earlier, Brian Singerman, a partner at Peter Thiel’s Founders Fund, was asked about contrarian areas worth investing in given the current landscape. His response: “Anything not AI”.

― AI’s Bubble Talk Takes a Bite Out Of The Euphoria - AI Supremacy [Link]


Steve Jobs famously said that Apple stands at the intersection of technology and liberal arts. Apple is supposed to enhance and improve our lives in the physical realm, not to replace cherished physical objects indiscriminately.

― Apple’s Dystopian iPad Video - The Rational Walk Newsletter [Link]

Key pillars of the new strategy (on gaming):

  • Expanding PC and cloud gaming options.
  • Powerful consoles (still a core part of the vision).
  • Game Pass subscriptions as the primary access point.
  • Actively bringing Xbox games to rival platforms (PS5, Switch).
  • Exploring mobile gaming with the potential for handheld hardware.

Microsoft’s “every screen is an Xbox” approach is a gamble and may take a long time to pay off. But the industry is bound to be device-agnostic over time as it shifts to the cloud and offers cross-play and cross-progression. It’s a matter of when not if.

― Microsoft: AI Inflection - App Economy Insights [Link]

Highlights: Azure's growth accelerated sequentially thanks to AI services and was the fastest-growing of the big three clouds (Amazon AWS, Google Cloud, Microsoft Azure). On search, Microsoft is losing market share to Alphabet. Capex on AI is growing roughly 80% YoY. On gaming, Microsoft is diversifying beyond selling consoles. Copilot and Office are succeeding with enterprise customers.

To founders, my advice is to remain laser-focused on building products and services that customers love, and be thoughtful and rational when making capital allocation decisions. Finding product-market fit is about testing and learning from small bets before doubling down, and it is often better to grow slower and more methodically as that path tends to lead to a more durable and profitable business. An axiom that doesn’t seem to be well understood is that the time it takes to build a company is also often its half-life.

― 2023 Annual Letter - Chamath Palihapitiya [Link]

This is a very insightful letter about how economic and tech trends of 2023 have shaped their thinking and investment portfolio. What I have learned from this letter:

  1. The tech industry has shifted its focus from unsustainable “growth at any cost” to more prudent forms of capital allocation. This has resulted in layoffs and in slashing projects that are not relevant to the core business.

  2. Rising interest rates were one cause of the banking crisis. During the zero-interest-rate decade, banks sought higher rates of return by purchasing longer-duration assets, whose value is negatively correlated with interest rates. As the resulting losses became known to the public, a liquidity crisis ensued.

  3. The advancement of Gen AI has lowered the barriers to starting a software company, lowered capital requirements in biotech and materials science, fundamentally changed the process of building companies, and empowered new entrants to challenge established businesses.

    • The key question is: where will value creation and capture take place? When and where should capital be allocated and companies be started? Some of the author's opinions:

      • It's premature to declare winners now. Instead, the author suggests that people deeply understand the underlying mechanisms that will be responsible for value creation over the next few years.

      • There are at least two areas of value creation now

        1. Proprietary data

          Example: recent partnership between Reddit and Google

        2. Infrastructure used to run AI application

          For apps built on top of language models, responsiveness is a critical linchpin. However, GPUs are not well-suited to run inference.

          Example: Author’s investment in Groq’s LPU for inference

  4. Heightened geopolitical tensions - the Russia-Ukraine conflict, Israel and Hamas, escalating tensions between China and Taiwan - have resulted in a de-globalization trend and a strategic shift in the US. US legislative initiatives aim to fuel a domestic industrial renaissance by incentivizing reshoring and fostering a more secure and resilient supply chain. They include the CHIPS Act, the Infrastructure Investment and Jobs Act, the Inflation Reduction Act, etc.

    • The author highlights the opportunity for allocators and founders: companies can creatively and strategically tap into different pools of capital - debt, equity, and government funding.

OpenAI’s strategy to get its technology in the hands of as many developers as possible — to build as many use cases as possible — is more important than the bot’s flirty disposition, and perhaps even new features like its translation capabilities (sorry). If OpenAI can become the dominant AI provider by delivering quality intelligence at bargain prices, it could maintain its lead for some time. That is, as long as the cost of this technology doesn’t drop near zero.

A tight integration with Apple could leave OpenAI with a strong position in consumer technology via the iPhone and an ideal spot in enterprise via its partnership with Microsoft.

― OpenAI Wants To Get Big Fast, And Four More Takeaways From a Wild Week in AI News - Big Technology [Link]

Because GPT-4o is 2x faster and 50% cheaper, it discourages competitors from developing rival LLMs and encourages companies to build on OpenAI's models for their business. This shows that OpenAI wants to get big fast. However, making GPT-4o free disincentivizes users from subscribing to the Plus tier.

There is a tight and deep bond between OpenAI and Apple. The desktop app debuted on the Mac, and Apple will build OpenAI's GPT technology into iOS.

“You can borrow someone else’s stock ideas but you can’t borrow their conviction. True conviction can only be obtained by trusting your own research over that of others. Do the work so you know when to sell. Do the work so you can hold. Do the work so you can stand alone.”

Investing isn’t about blindly following the herd. It’s about carving your own path, armed with knowledge, patience, and a relentless pursuit of growth and learning.

― Hedge Funds’ Top Picks in Q1 - App Economy Insights [Link]

As I’ve dug into this in more detail, I’ve become convinced that they are doing something powerful by searching over language steps via tree-of-thoughts reasoning, but it is much smaller of a leap than people believe. The reason for the hyperbole is the goal of linking large language model training and usage to the core components of Deep RL that enabled success like AlphaGo: self-play and look-ahead planning.

To create the richest optimization setting, having the ability to generate diverse reasoning pathways for scoring and learning from is essential. This is where Tree-of-Thoughts comes in. The prompting from ToT gives diversity to the generations, which a policy can learn to exploit with access to a PRM.

Q* seems to be using PRMs to score Tree of Thoughts reasoning data that then is optimized with Offline RL. This wouldn't look too different from existing RLHF toolings that use offline algorithms like DPO or ILQL that do not need to generate from the LLM during training. The ‘trajectory' seen by the RL algorithm is the sequence of reasoning steps, so we're finally doing RLHF in a multi-step fashion rather than contextual bandits!

Let’s Verify Step by Step: a good introduction to PRMs.

― The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data - Interconnects [Link]

It’s well known on the street that Google DeepMind has split all projects into three categories: Gemini (the large looming model), Gemini-related in 6-12months (applied research), and fundamental research, which is oddly only > 12 months out. All of Google DeepMind’s headcount is in the first two categories, with most of it being in the first.

Everyone on Meta’s GenAI technical staff should spend about 70% of the time directly on incremental model improvements and 30% of the time on ever-green work.

A great read from Francois Chollet on links between prompting LLMs, word2vec, and attention. One of the best ML posts I’ve read in a while.

Slides from Hyung Won Chung’s (OpenAI) talk on LLMs. Great summary of intuitions for the different parts of training. The key point: We can get further with RLHF because the objective function is flexible.

― The AI research job market shit show (and my experience) - Interconnects [Link]

10 Lessons From 2024 Berkshire Hathaway Annual Shareholder Meeting - Capitalist Letters [Link]

What I’ve learned from this article:

  1. Why did Berkshire trim its AAPL position?

    There is no concern about Apple's earnings potential; it makes sense to take some profits as the valuation is now very high.

  2. Right way to look at share buybacks

    A business should pay dividends only if it cannot make good use of its excess capital. Good use of capital is measured by return on equity, which is on average about 12% for American companies. If the company can allocate capital better than shareholders themselves and provide them with above-average returns, it should retain the earnings and allocate the capital itself.

    Buybacks only make sense at the right price, and buying back shares just to support the stock price is not the best action to take for shareholders. All investment decisions should be price dependent.

  3. How would he invest small sums of money?

    During market crashes or economic downturns, you find exceptional companies trading at ridiculously cheap prices, and that's your opportunity. When those companies are fairly priced or overvalued, you look for special situations while holding onto your positions in those exceptional companies.

  4. Views on capital allocation

    Study picking businesses, not stocks.

  5. Investing in foreign countries

    America has been a great country for building wealth and capitalist democracy is the best system of governance ever invented.

  6. Advice on job picking

    Remember Steve Jobs’ famous words in the Stanford Commencement speech he gave before his death: “Keep looking, don’t settle!”

  7. On the importance of culture

    In the Berkshire culture, shareholders see themselves as the owners of the businesses. Greg Abel will keep the culture alive in the post-Buffett period, and this will automatically attract top talent to a place where people are given full responsibility and trust.

  8. When to sell stocks

    1. A bigger opportunity comes up, 2. something drastically changes in the business, and 3. to raise money
  9. Effects of consumer behavior on investment decisions

    Two types of businesses have durable competitive advantage: 1) Lowest cost suppliers of products and services, 2) suppliers of unique products and services.

  10. How to live a good life? “I've written my obituary the way I've lived my life.” - Charlie Munger

NVIDIA: Industrial Revolution - App Economy Insights [Link]

Primary drivers of Data Center revenue: 1) strong demand (up 29% sequentially) for the Hopper GPU computing platform used for training and inference with LLMs, recommendation engines, and GenAI apps; 2) InfiniBand end-to-end solutions for networking (down 5% sequentially due to the timing of supply). NVIDIA started shipping Spectrum-X Ethernet networking solutions optimized for AI.

In the earnings call, three major customer categories were described: 1) cloud service providers (CSPs), including the hyperscalers Amazon, Microsoft, and Google; 2) enterprise usage: Tesla expanded its AI training cluster to 35,000 H100 GPUs and used NVIDIA AI for FSD V12; 3) consumer internet companies: Meta's Llama 3, powering Meta AI, was trained on a cluster of 24,000 H100 GPUs.

Huang explained in the earnings call that AI is no longer only a chip problem but a systems problem. They build AI factories.

For further growth, the Blackwell platform is coming, Spectrum-X networking is expanding, and new software tools like NIM are being developed.

A lot of current research focuses on LLM architectures, data sources, prompting, and alignment strategies. While these can lead to better performance, such developments have three inter-related critical flaws:

  1. They mostly work by increasing the computational costs of training and/or inference.
  2. They are a lot more fragile than people realize and don’t lead to the across-the-board improvements that a lot of Benchmark Bros pretend.
  3. They are incredibly boring. A focus on getting published/getting a few pyrrhic victories on benchmarks means that these papers focus on making tweaks instead of trying something new, pushing boundaries, and trying to address the deeper issues underlying these processes.

― Revolutionizing AI Embeddings with Geometry [Investigations] - Devansh [Link]

Very little AI research avoids flaws #1 and #3; the work that does is really good, hard-core work. Time is required to verify whether it is generalizable and widely applicable, especially since the process of scientific research nowadays is very different from earlier eras, when a decade might pass between starting the work and publishing it.

This article highlights some publications on complex embeddings and looks into how they improve embeddings by using complex numbers. Current challenges in embeddings are: 1) sensitivity to outliers; 2) limited capacity to capture complex relationships in unstructured text; 3) inconsistency in pairwise rankings of similarities; and 4) computational cost. The next generation of complex embeddings benefits from the following pillars: 1) complex geometry provides a richer space to capture nuanced relationships and handle outliers; 2) orthogonality allows each dimension to be independent and distinct; 3) contrastive learning can be used to minimize the distance between similar pairs and maximize the distance between dissimilar pairs. Complex embeddings have many advantages: 1) increased representation capacity from the two components (real and imaginary) of complex numbers; 2) complex geometry allows for orthogonality, which improves generalization and lets training reach stable convergence quickly; 3) robust features can be captured, which improves robustness; and 4) they address a limitation of cosine similarity (saturation zones that lead to vanishing gradients during optimization) through angle optimization in complex space.
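
To make that last point concrete, here is a toy sketch (my own illustration of the general idea, not any paper's exact method) of treating embeddings as complex vectors and scoring similarity through the Hermitian inner product, so that similarity is driven by relative angles:

```python
import numpy as np

def complex_similarity(u, v):
    """|<u, v>| / (||u|| ||v||) for complex vectors u, v."""
    inner = np.vdot(u, v)  # Hermitian inner product: conj(u) . v
    return np.abs(inner) / (np.linalg.norm(u) * np.linalg.norm(v))

rng = np.random.default_rng(0)
u = rng.normal(size=8) + 1j * rng.normal(size=8)
v = u * np.exp(1j * 0.1)  # u rotated by a small global phase
w = rng.normal(size=8) + 1j * rng.normal(size=8)  # unrelated vector

print(complex_similarity(u, v))  # ~1.0: phase-rotated copy stays similar
print(complex_similarity(u, w))  # noticeably lower
```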

Llama 3 8B might be the most interesting all-rounder for fine-tuning as it can be fine-tuned on a single GPU when using LoRA.

Phi-3 is very appealing for mobile devices. A quantized version of it can run on an iPhone 14.

― How Good Are the Latest Open LLMs? And Is DPO Better Than PPO? [Link]

A good paper-review article. Highlights of the key discussions:

  • Mixtral 8x22B: The key idea is to replace each feed-forward module in a transformer architecture with 8 expert layers, of which a router activates only a subset per token. It achieves fewer active parameters (lower cost) and higher performance (MMLU).

  • Llama 3: The main differences between Llama 3 and Llama 2 are: 1) the vocabulary size has been increased; 2) grouped-query attention is used; 3) both PPO and DPO are used. The key research finding is that more data gives better performance, no matter the model size.

    “Llama 3 8B might be the most interesting all-rounder for fine-tuning as it can be fine-tuned on a single GPU when using LoRA.”

  • Phi-3: Key characteristics are: 1) it's based on the Llama architecture; 2) it was trained on 5x fewer tokens than Llama 3; 3) it uses the same tokenizer as Llama 2, with a vocab size of 32064, much smaller than Llama 3's; 4) it has only 3.8B parameters, less than half the size of Llama 3 8B; 5) its secret sauce is dataset quality over quantity - it's trained on heavily filtered web data and synthetic data.

    “Phi-3 is very appealing for mobile devices. A quantized version of it can run on an iPhone 14.”

  • OpenELM: Key characteristics are: 1) four relatively small sizes: 270M, 450M, 1.1B, and 3B; 2) the instruct version is trained with rejection sampling and DPO; 3) it performs slightly better than OLMo, even though it was trained on 2x fewer tokens; 4) the main architecture tweak is a layer-wise scaling strategy; 5) it samples a relatively small subset of 1.8T tokens from various public datasets, with no clear rationale given for the subsampling; 6) one main research finding is that there is no clear difference between LoRA and DoRA for parameter-efficient fine-tuning.

    About the layer-wise scaling strategy: 1) there are N transformer blocks in the model; 2) the layers are gradually widened from the early to the later transformer blocks, so block by block: a) the number of heads increases, b) the dimension of each layer increases.

  • DPO vs PPO: The main difference between DPO and PPO is that “DPO does not require training a separate reward model but uses a classification-like objective to update the LLM directly” (a minimal sketch of the DPO objective follows this list).

    Key findings of the paper and suggested best practices: 1) PPO is generally better than DPO if you use it correctly. DPO suffers from out-of-distribution data, meaning the instruction data differs from the preference data; the solution could be to “add a supervised instruction fine-tuning round on the preference dataset before following up with DPO fine-tuning.” 2) If you use DPO, make sure to perform SFT on the preference data first. 3) “Iterative DPO, which involves labeling additional data with an existing reward model, is better than DPO on existing preference data.” 4) “If you use PPO, the key is to use large batch sizes, advantage normalization, and parameter update via exponential moving average.” 5) Though PPO is generally better, DPO is more straightforward and will remain a popular go-to option. 6) Both can be used: recall the pipeline behind Llama 3: pretraining -> SFT -> rejection sampling -> PPO -> DPO.
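
As promised above, a minimal sketch of the DPO objective itself: this is the published DPO loss, written from the description quoted above; the variable names and toy numbers are mine, and it assumes you already have per-sequence log-probabilities from the policy and a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """All inputs are log-probabilities of whole responses, shape (batch,).

    The loss is a classification-like objective: push the policy's margin
    between chosen and rejected responses above the reference model's margin.
    """
    logits = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -F.logsigmoid(logits).mean()

# Toy numbers: the policy prefers the chosen response a bit more strongly
# than the reference model does, so the loss is below log(2) ~ 0.693.
loss = dpo_loss(torch.tensor([-10.0]), torch.tensor([-12.0]),
                torch.tensor([-10.5]), torch.tensor([-11.5]))
print(loss.item())
```

Note how no reward model appears anywhere: the frozen reference log-probs play that role implicitly, which is exactly why DPO is simpler to run than PPO.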

Google I/O AI keynote updates 2024 - AI Supremacy [Link]

Streaming Wars Visualized - App Economy Insights [Link]

This Week in Visuals - App Economy Insights [Link]

Gig Economy Shakeup - App Economy Insights [Link]

Articles

Musings on building a Generative AI product - LinkedIn Engineering Blog [Link]

This is a very good read about developing a Gen AI product for business using a pre-trained LLM. The article elaborates on how the product is designed, how each part works, what works and what doesn't, what has been improving, and what has been a struggle. Some takeaways for me:

  1. The supervised fine-tuning step was done via embedding-based retrieval (EBR), powered by an in-memory database, injecting response examples into prompts (see the EBR sketch after this list).

  2. An organizational structure was designed to ensure communication consistency: one horizontal engineering pod for global templates and styles, and several vertical engineering pods for specific tasks such as summarization, job fit assessment, interview tips, etc.

  3. Tricky work:

    1. Developing an end-to-end automatic evaluation pipeline.

    2. Dynamically discovering and invoking APIs/agents.

      This requires input and output to be ‘LLM friendly' - JSON or YAML schemas.

    3. Supervised fine-tuning with responses injected from an internal database.

      As evaluation becomes more sophisticated, prompt engineering needs to improve to reach high quality/evaluation scores. The difficulty is that quality scores shoot up fast and then plateau, so it's hard to reach a very high score in the late improvement stage. This makes prompt engineering more of an art than a science.

    4. Trading off capacity and latency.

      Chain of Thought can improve the quality and accuracy of responses but increases latency. TimeToFirstToken (TTFT) and TimeBetweenTokens (TBT) matter for utilization but need to be bounded to limit latency. They also intend to implement end-to-end streaming and an async non-blocking pipeline.
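
A minimal sketch of the EBR idea from takeaway 1 (my own toy version, not LinkedIn's actual system; the embed() function is a hash-based stand-in for a real embedding model, and the example store is made up):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: a real system would call an embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)  # unit-normalize so dot product = cosine

# In-memory "database" of curated response examples.
examples = ["Summarize this post ...", "Assess job fit for ...", "Interview tips for ..."]
index = np.stack([embed(e) for e in examples])

query = "How well does this candidate fit the role?"
scores = index @ embed(query)      # cosine similarity against every example
top_k = np.argsort(-scores)[:2]    # retrieve the 2 closest examples

prompt = ("Use these examples:\n"
          + "\n".join(examples[i] for i in top_k)
          + "\n\nNow answer: " + query)
print(prompt)
```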

The concept of open source was devised to ensure developers could use, study, modify, and share software without restrictions. But AI works in fundamentally different ways, and key concepts don’t translate from software to AI neatly, says Maffulli.

But depending on your goal, dabbling with an AI model could require access to the trained model, its training data, the code used to preprocess this data, the code governing the training process, the underlying architecture of the model, or a host of other, more subtle details.

Which ingredients you need to meaningfully study and modify models remains open to interpretation.

both Llama 2 and Gemma come with licenses that restrict what users can do with the models. That’s anathema to open-source principles: one of the key clauses of the Open Source Definition outlaws the imposition of any restrictions based on use cases.

All the major AI companies have simply released pretrained models, without the data sets on which they were trained. For people pushing for a stricter definition of open-source AI, Maffulli says, this seriously constrains efforts to modify and study models, automatically disqualifying them as open source.

― The tech industry can’t agree on what open-source AI means. That’s a problem. ― MIT Technology Review [Link]

This article argues that current definitions of open-source AI are problematic. ‘Open' models either carry restrictions on usage or don't release details of their training data, which does not fit the traditional definition of ‘open source'. Some argue that AI is a special case needing its own definition of open source, but as long as the definition remains vague, it is problematic, because big tech will define open-source AI to be whatever suits it.

Everything I know about the XZ backdoor [Link]

Some great high-level technical overviews of the XZ backdoor [Link] [Link] [Link] [Infographic] [Link] [Link]

A backdoor in xz-utils (used for lossless compression) was recently revealed by Andres Freund (Principal SDE at Microsoft). The backdoor only activates when, at a minimum, a few specific criteria are met: 1) running a distro that uses glibc; 2) having xz or liblzma version 5.6.0 or 5.6.1 installed. A malicious script called build-to-host.m4 checks various conditions, like the architecture of the machine, and if those conditions are met, the payload is injected into the source tree. The intention of the payload is still under investigation. Lasse Collin, one of the maintainers of the repo, has posted an update and is carefully analyzing the situation. In the article, the author Evan Boehs presents a timeline of the attack and of online investigators' discoveries about Jia Tan's identity (from IP addresses, LinkedIn, commit timings, etc.), and raises awareness of the human costs of open source.

Having a crisp mental model around a problem, being able to break it down into steps that are tractable, perfect first-principle thinking, sometimes being prepared (and able to) debate a stubborn AI — these are the skills that will make a great engineer in the future, and likely the same consideration applies to many job categories.

― Why Engineers Should Study Philosophy ― Harvard Business Review [Link]

Humans are entering a new stage of learning: smartly asking AI questions to get answers as accurate as possible. Prompt engineering is therefore a very important skill in the AI era. To master it, we need a divide-and-conquer mindset, first-principles thinking, critical thinking, and skepticism.

If we had infinite capacity for memorisation, it’s clear the transformer approach is better than the human approach - it truly is more effective. But it’s less efficient - transformers have to store so much information about the past that might not be relevant. Transformers (🤖) only decide what’s relevant at recall time. The innovation of Mamba (🐍) is allowing the model better ways of forgetting earlier - it’s focusing by choosing what to discard using Selectivity, throwing away less relevant information at memory-making time.

― Mamba Explained [Link]

A very in-depth explanation of the Mamba architecture. The main difference between the Transformer and Mamba is that the Transformer stores all past information and decides what is relevant at recall time, while Mamba uses Selectivity to decide what to discard earlier. Mamba aims for both efficiency and effectiveness (space complexity drops from O(n) to O(1), time complexity from O(n^2) to O(n)). If the Transformer has high effectiveness but low efficiency due to its large state, and the RNN has high efficiency but low effectiveness due to its small state, Mamba is in between: it selectively and dynamically compresses data into the state.
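
A toy sketch of the selectivity idea (a cartoon of the mechanism, not the actual Mamba implementation): the state has a fixed size, and an input-dependent gate decides at write time what to keep and what to forget, giving O(1) memory and O(n) time in sequence length.

```python
import numpy as np

def selective_scan(x, W_gate, W_in):
    """x: (seq_len, d_in) inputs; returns per-step hidden states."""
    d_state = W_gate.shape[0]
    h = np.zeros(d_state)  # fixed-size state: O(1) memory vs. sequence length
    outputs = []
    for t in range(x.shape[0]):  # single pass: O(n) time vs. O(n^2) attention
        gate = 1 / (1 + np.exp(-(W_gate @ x[t])))  # input-dependent forget gate
        h = gate * h + (1 - gate) * (W_in @ x[t])  # selectively overwrite state
        outputs.append(h.copy())
    return np.stack(outputs)

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))
out = selective_scan(x, rng.normal(size=(8, 4)), rng.normal(size=(8, 4)))
print(out.shape)  # (10, 8)
```

The contrast with attention is visible in the loop: nothing about the past is stored except the compressed state h, so the model must choose what to remember at memory-making time rather than at recall time.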

The Power of Prompting ― Microsoft Research Blog [Link]

This study demonstrates that GPT-4 can outperform a leading model fine-tuned specifically for medical applications by using Medprompt, a composition of several prompting strategies. It shows that fine-tuning might not always be necessary: though it can boost performance, it is resource-intensive and cost-prohibitive. Simple prompting strategies can transform generalist models into specialists and extend the benefits of models to new domains and applications. A similar study in the finance domain by JP Morgan reached similar results.

Previously, we made some progress matching patterns of neuron activations, called features, to human-interpretable concepts. We used a technique called “dictionary learning”, borrowed from classical machine learning, which isolates patterns of neuron activations that recur across many different contexts.

In turn, any internal state of the model can be represented in terms of a few active features instead of many active neurons. Just as every English word in a dictionary is made by combining letters, and every sentence is made by combining words, every feature in an AI model is made by combining neurons, and every internal state is made by combining features.

The features are likely to be a faithful part of how the model internally represents the world, and how it uses these representations in its behavior.

― Mapping the Mind of a Large Language Model - Anthropic [Link]

This is an amazing work towards AI safety by Anthropic. The main goal is to understand the inner workings of AI models and identify how millions of concepts are represented inside Claude Sonnet, so that developers can better control AI safety. Previous progress of this work was to match pattern of neuron activations (“features”) to human-interpretable concepts by technique called “dictionary learning”. Now they are scaling up the technique to the vastly larger AI language models. Below is a list of key experiments and findings.

  1. Extracted millions of features from the middle layer of Claude 3.0 Sonnet. The features have a depth, breadth, and abstraction reflecting Sonnet's advanced capabilities.
  2. Found more abstract features, such as responding to bugs in code and discussion of gender bias in professions.
  3. Measured a "distance" between features based on which neurons appeared in their activation patterns, and found that features with similar concepts are close to each other. This demonstrates that the internal organization of concepts in an AI model corresponds to human notions of similarity.
  4. By artificially amplifying or suppressing features, they observed how Claude's responses change. This shows that features can be used to change how a model acts.
  5. For the purpose of AI safety, they found features corresponding to capabilities with misuse potential (code backdoors, developing bio-weapons), different forms of bias (gender discrimination, racist claims about crime), and potentially problematic AI behaviors (power-seeking, manipulation, secrecy).
  6. Addressing the earlier concern about sycophancy, they also found a feature associated with sycophantic praise.

This study proposed a good approach to ensure AI safety: use the technique described here to monitor AI systems for dangerous behaviors and to debias outcomes.
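
For intuition about the dictionary-learning technique itself, here is a minimal sparse-autoencoder sketch in Python; the sizes and L1 penalty are illustrative assumptions, not Anthropic's actual setup:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Decompose each activation vector into a few active 'features':
    reconstruction keeps the information, the L1 penalty keeps it sparse."""

    def __init__(self, d_act=512, n_features=4096, l1_coeff=1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_act, n_features)
        self.decoder = nn.Linear(n_features, d_act)
        self.l1_coeff = l1_coeff

    def forward(self, act):
        features = torch.relu(self.encoder(act))   # sparse feature activations
        recon = self.decoder(features)             # state ~ sum of a few features
        loss = ((recon - act) ** 2).mean() + self.l1_coeff * features.abs().mean()
        return features, loss
```

Steering the model, as in finding 4 above, then corresponds roughly to scaling one feature's decoder direction up or down before reconstruction.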

To qualify as a “Copilot+ PC,” a computer needs distinct CPU, GPU, and NPU (neural processing unit), with the NPU capable of >40 trillion operations per second (TOPS), plus a minimum of 16 GB RAM and a 256 GB SSD.

All of those analysts who assumed Wal-Mart would squish Amazon in e-commerce thanks to their own mastery of logistics were like all those who assumed Microsoft would win mobile because they won PCs. It turns out that logistics for retail are to logistics for e-commerce as operating systems for a PC are to operating systems for a phone. They look similar, and even have the same name, but require fundamentally different assumptions and priorities.

I then documented a few seminal decisions made to demote Windows, including releasing Office on iPad as soon as he took over, explicitly re-orienting Microsoft around services instead of devices, isolating the Windows organization from the rest of the company, killing Windows Phone, and finally, in the decision that prompted that Article, splitting up Windows itself. Microsoft was finally, not just strategically but also organizationally, a services company centered on Azure and Office; yes, Windows existed, and still served a purpose, but it didn’t call the shots for the rest of Microsoft’s products.

That celebration, though, is not because Windows is differentiating the rest of Microsoft, but because the rest of Microsoft is now differentiating Windows. Nadella’s focus on AI and the company’s massive investments in compute are the real drivers of the business, and, going forward, are real potential drivers of Windows.

This is where the Walmart analogy is useful: McMillon needed to let e-commerce stand on its own and drive the development of a consumer-centric approach to commerce that depended on centralized tech-based solutions; only then could Walmart integrate its stores and online services into an omnichannel solution that makes the company the only realistic long-term rival to Amazon.

Nadella, similarly, needed to break up Windows and end Ballmer’s dreams of vertical domination so that the company could build a horizontal services business that, a few years later, could actually make Windows into a differentiated operating system that might, for the first time in years, actually drive new customer acquisition.

― Windows Returns - Stratechery [Link]

Chatbot Arena results are in: Llama 3 dominates the upper and mid cost-performance front (full analysis) ― Reddit [Link]

Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora [Link]

YouTube and Podcasts

I don't have an answer to peace in the Middle East, I wish I did, but I do have a very strong view that we are not going to get to peace when we are apologizing for or denying crimes against humanity and the crime of mass rape of women. That's not the path to peace; the path to peace is not saying this didn't happen. The path to peace is saying this happened, and no matter what side of the fence you are on, no matter what side of the world you are on, whether you are on the far right or the far left, anywhere in the world, we are not going to let this happen again, and that is how we are going to get to peace. - Sheryl Sandberg

― In conversation with Sheryl Sandberg, plus open-source AI gene editing explained - All-In Podcast [Link]

U.N. to Study Reports of Sexual Violence in Israel During Oct. 7 Attack [Link]

Western media concocts ‘evidence’ UN report on Oct 7 sex crimes failed to deliver [Link]

It's crazy that what is happening right now at some colleges is not protesting sexual violence as a tool of war by Hamas. This kind of ignorance or denial of sexual violence is horrible. People are so polarized into black and white that if something does not fit their view, they reject it. There are more than two sides to the Middle East story, and one of them is sexual violence: mass rape, genital mutilation of men and women, women tied to trees naked, bloody, legs spread…

There is a long history of women's bodies being involved in wars. Only 30 years ago did people start to say that rape is not a tool of war and should be prosecuted as a war crime against humanity; feminist, human rights, and civil rights groups made this happen. Now it has happened again in Gaza, according to the report released by the U.N. However, there are many difficulties in proving and testifying to the truth, e.g., investigators could not locate a single victim, or did not have victims' consent to take pictures. But the victims are dead and cannot speak up. Denying the fact of sexual violence is just unacceptable. There is also a great documentary shedding light on the unspeakable sexual violence committed on Oct 7, 2023, which I think everyone should watch.

The good news is that eyewitness testimony meets the criteria of international and global courts, so the crimes can be proven through eyewitnesses.

John Schulman - Reinforcement Learning from Human Feedback: Progress and Challenges [Link]

John Schulman is a research scientist and cofounder of OpenAI, focusing on reinforcement learning (RL) algorithms. He gave a talk on making AI more truthful on Apr 24, 2023 at UC Berkeley. The ideas and discussions are still helpful and insightful today.

In this talk, John discussed the issue of hallucination in large language models. He argues that behavior cloning (supervised learning) is not enough to fix the hallucination problem; instead, reinforcement learning from human feedback (RLHF) can improve the model's truthfulness by 1) adjusting the output distribution so the model is allowed to express uncertainty, challenge premises, and admit errors, and 2) learning behavior boundaries. In his conceptual model, fine-tuning leads the model to hallucinate when it lacks knowledge. Retrieval and citing external sources can help improve verifiability, and John discusses models that browse the web to answer technical questions while citing relevant sources.
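
To see how a reward scheme can make expressing uncertainty the rational behavior, consider this tiny expected-reward calculation; the reward values are illustrative assumptions, not numbers from the talk:

```python
def best_action(p_correct, r_right=1.0, r_idk=0.0, r_wrong=-4.0):
    """With asymmetric rewards (illustrative values), answering only beats
    hedging above a confidence threshold."""
    ev_answer = p_correct * r_right + (1 - p_correct) * r_wrong
    if ev_answer > r_idk:
        return ("answer", ev_answer)
    return ("say 'I don't know'", r_idk)

for p in (0.5, 0.8, 0.95):
    print(p, best_action(p))
# With these numbers the break-even point is p = 0.8: below it, the
# expected-reward-maximizing policy is to express uncertainty.
```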

John mentioned three open problems for LLMs: 1) how to train models to express uncertainty in natural language, 2) how to go beyond what human labelers can easily verify ("scalable oversight"), and 3) how to optimize for true knowledge rather than human approval.

The 1-Year Old AI Startup That’s Rivaling OpenAI — Redpoint’s AI Podcast [Link]

A great interview with Mistral CEO Arthur Mensch on sovereignty and open models as a business strategy. Here are some highlighted points from Arthur:

  1. Open source is going to solidify in the future. It is an infrastructure technology, and at the end of the day it should be modifiable and owned by customers. Mistral now has two offerings, an open-source one and a commercial one, and the aim is to find a business model that sustains open-source development.
  2. The things Mistral is best at: 1) training models, and 2) specializing models.
  3. Their partnership strategy is to look at what enterprises need, where they operate, and where the developers operate, then figure out the channels that would facilitate adoption and spread. Being a multiplatform solution and replicating the solution across platforms is the strategy Mistral follows.
  4. There is still an efficiency upper bound to be pushed. Beyond the compute spent on pre-training, there is still research to do on improving model efficiency and strength. On the architecture side, we can be more efficient than the plain Transformer, which spends the same amount of compute on every token. Mistral is making models faster; by doing so, it opens up applications that use an LLM as a basic brick and can then figure out planning, exploration, etc. By increasing efficiency, we open up areas of research.
  5. Meta has more GPUs than Mistral does, but Mistral has a good concentration of GPUs (GPUs per person). This is the way to be as efficient as possible and come up with creative ways of training models. Unit economics also need to be considered, to make sure that every \(\$1\) spent on training compute eventually accrues to more than \(\$1\) of revenue.
  6. The Transformer is not an optimal architecture. It has been out there for 7 years now, and everything is co-adapted to it: training methods, debugging methods, algorithms, and hardware. It is challenging to find a better architecture that also beats the baseline, but there is a lot of research on modifying attention to boost memory efficiency, and a lot can be done in that and similar directions.
  7. On AI regulation and the EU AI Act, Arthur states that it does not solve the actual problem of how to make AI safe. Making AI safe is a hard problem (the model is stochastic), different from the way we evaluated software before. It is more a product problem than a regulation problem: we need to rethink continuous integration, verification, etc., and make sure everything happens as it should.
  8. Mistral recently released Le Chat to help enterprises start incorporating AI. It gives them an assistant contextualized on their enterprise data. It is a tool to get closer to the end user and gather feedback for the developer platform, and also a tool to bring enterprises into GenAI.

Open Source AI is AI we can Trust — with Soumith Chintala of Meta AI [Link]

Synthetic data is the next rage in LLMs. Soumith pointed out that synthetic data works where we humans already have good symbolic models; we need to impart that knowledge to neural networks, and synthetic data turns out to be a vehicle for imparting it. Related to synthetic data, but in an unusual way, there is new research on distilling GPT-4 by creating synthetic data from it, creating mock textbooks inspired by Phi-2, and then fine-tuning open-source models like Llama.

Open source means different things to different people, and we don't yet have a community-norm definition at this very early stage of LLMs; when asked about open source, people in this field tend to clarify their definition of it up front. On this topic, Soumith pointed out that the most beneficial value of openness is that it makes distribution wide and frictionless, so people can do transformative things in a way that is very accessible.

Berkshire Hathaway 2024 Annual Meeting Movie: Tribute to Charlie Munger [Link]

The first year the annual meeting movie is made public, and the first year the annual meeting is without Charlie. I've already started to miss his jokes.

I think the reason why the car could have been completely reimagined by Apple is that they have a level of credibility and trust that I think probably no other company has, and absolutely no other tech company has. I think this was the third Steve Jobs story that I left out but in 2001, I launched a 99 cent download store and Steve Jobs just ran total circles around us, but the reason he was able to is he had all the credibility to go to the labels and get deals done for licensing music that nobody could get done before. I think that is an example of what Apple’s able to do which is to use their political capital to change the rules. So if the thing that we could all want is safer roads and autonomous vehicles, there are regions in every town and city that could be completely converted to level 5 autonomous zones. If I had to pick one company that had the credibility to go and change those rules, it’s them. Because they could demonstrate that there was a methodical safe approach to doing something. So the point is that even in these categories that could be totally reimagined, it’s not for a lack of imagination, again it just goes back to a complete lack of will. I understand because if you had 200B dollars of capital on your balance sheet, I think it’s probably easy to get fat and lazy. - Chamath Palihapitiya

― In conversation with Sam Altman — All-In Podcast [Link]

If you are a developer, the key thing to understand is where does model innovation end and your innovation begin, because if you get that wrong you will end up doing a bunch of stuff that the model will just obsolete in a few months. - David Sacks

The incentive for these folks is going to be to push this stuff into the open source. Because if you solve a problem that's operationally necessary for your business but isn't the core part of your business, what incentive do you have to really keep investing in it for the next 5 to 10 years to improve it? You are much better off releasing it in the open source and letting the rest of the community take it over so that it's available to everybody else; otherwise you are going to be stuck supporting it, and then if and when you ever wanted to switch out a model, GPT-4o, Claude, Llama, it's going to be costly. The incentive to push towards open source in this market, if you will, is so much more meaningful than in any other market. - Chamath Palihapitiya

I think the other thing that is probably true is that a big measure at Google on the search page, in terms of search engine performance, was the bounceback rate, meaning someone does a search, they go off to another site and they come back because they didn't get the answer they wanted. Then the one box launched, which shows a short answer on the top, which basically keeps people from having a bad search experience, because they get the result right away. So a key metric is they are going to start to discover which vertical searches will provide the user a better experience than them jumping off to a third-party page to get the same content. And then they will be able to monetize that content that they otherwise were not participating in the monetization of. So I think the real victim in all this is that long tail of content on the internet that probably gets cannibalized by the snippet one box experience within the search function. And then I do think that the revenue per search query in some of those categories actually has the potential to go up, not down. You keep people on the page so you get more search volume there; you get more searches because of the examples you gave. And then when people do stay, you now have the ability to better monetize that particular search query, because you otherwise would have lost it to the third-party content page. Keeping more of the experience integrated, they could monetize the search per query higher, they are going to have more queries, and then they are going to have the quality of the queries go up. Going back to our earlier point about precision vs accuracy, my guess is there's a lot of hedge fund type folks doing a lot of this precision type of analysis, trying to break apart search queries by vertical and figure out what the net effect will be of having better AI-driven box and snippets. And my guess is that is why there is a lot of buying activity happening. I can tell you Meta and Amazon do not have an Isomorphic Labs and Waymo sitting inside their business that suddenly pops to a couple hundred billion of market cap, and Google does have a few of those. - David Friedberg

One thing I would say about big companies like Google or Microsoft is that the power of your monopoly determines how many mistakes you get to make. So think about how Microsoft completely missed the iPhone; remember, they screwed up the whole smartphone era and it didn't matter. Same thing here with Google: they completely screwed up AI. They invented the Transformer, completely missed LLMs. Then they had that fiasco where they have black George Washington. It doesn't matter; they can make 10 mistakes, but their monopoly is so strong that they can finally get it right by copying the innovator, and they are probably going to become a \(\$5\)T company. - David Sacks

― GPT-4o launches, Glue demo, Ohalo breakthrough, Druckenmiller’s bet, did Google kill Perplexity? — All-In Podcast [Link]

Great conversations and insightful discussions as usual. Love it.

When you are over earning so massively, the rational thing to do for other actors in the arena is to come and attack that margin, and give it to people for slightly cheaper slightly faster slightly better so you can take share. So I think what you’re seeing and what you will see even more now is this incentive for Silicon Valley who has been really reticent to put money into chips, really reticent to put money into hardware. They are going to get pulled into investing this space because there is no choice. - Chamath Palihapitiya

Why? It's not that Intel was a worse company, but it's that everything else caught up. And the economic value went to things that sat above them in the stack: it went to Cisco for a while, then after Cisco it went to the browser companies for a little bit, then to the app companies, then to the device companies, then to the mobile companies. So you see this natural tendency for value to push up the stack over time. For AI, we've done step one, which is giving all this value to NVIDIA, and now we are going to see it being reallocated. - Chamath Palihapitiya

The reason why they are asking these questions is that if you go back to the dot-com boom in 1999, you can see that Cisco had this incredible run. And if you overlay the stock price of Nvidia, it seems to be following that same trajectory. And what happened with Cisco is that when the dot-com crash came in 2000, Cisco stock lost a huge part of its value. Obviously Cisco is still around today and it's a valuable company, but it just hasn't ever regained the type of market cap it had. The reason this happened is because Cisco got commoditized. So the success and market cap of that company attracted a whole bunch of new entrants and they copied Cisco's products until they were total commodities. So the question is whether that will happen to Nvidia. I think the difference here is that at the end of the day network equipment, which Cisco produced, was pretty easy to copy, whereas if you look at Nvidia, these GPU cores are really complicated to make. So it's a much more complicated product to copy. And then on top of that, they are already in the R&D cycle for the next chip. So I think you can make the case that Nvidia has a much better moat than Cisco. - David Sacks

I think Nvidia is going to get pulled into competing directly with the hyperscalers. So if you were just selling chips, you probably wouldn’t, but these are big bulky actual machines, then all of a sudden you are like well why don’t I just create my own physical plant and just stack these things, and create racks and racks of these machines. It’s not a far stretch especially because Nvidia actually has the software interface that everybody uses which is CUDA. I think it’s likely that Nvidia goes on a full frontal assault against GCP and Amazon and Microsoft. That’s going to really complicate the relationship that those folks have with each other, but I think it’s inevitable because how do you defend an enormously large market cap, you are forced to go into businesses that are equally lucrative. Now if I look inside of compute and look at the adjacent categories, they are not going to all of a sudden start a competitor to TikTok or a social network, but if you look at the multi hundred billion revenue businesses that are adjacent to the markets that Nvidia enables, the most obvious ones are the hyperscalers. So they are going to be forced to compete otherwise their market cap will shrink and I don’t think they want that, and then it’s going to create a very complicated set of incentives for Microsoft and Google and Meta and Apple and all the rest. And that’s also going to be an accelerant, they are going to pump so much money to help all of these upstarts. - Chamath Palihapitiya

The economy is bad, without recognizing that it is an inflationary experience, whereas economists use the definition of "economic growth" being gross product, and so if gross product or gross revenue is going up they are like, oh, the economy is healthy, we are growing. But the truth is we are funding that growth with leverage at the national and federal level and at the household, domestic level. We are borrowing money to inflate the revenue numbers, and so GDP goes up but the debt is going higher, and so the ability for folks to support themselves, buy things that they want to buy, and continue to improve their condition in life has declined; things are getting worse… The average American's ability to improve their condition has largely been driven by their ability to borrow, not by their earnings. - David Friedberg

Scarlett Johansson vs OpenAI, Nvidia’s trillion-dollar problem, a vibecession, plastic in our balls [Link]

It’s a fun session and it made my day :). Great discussions about Nvidia’s business, America’s negative economic sentiment, harm of plastics, etc.

Building with OpenAI What’s Ahead [Link]

Papers and Reports

Large Language Models: A Survey [Link]

This is a must-read paper if you would like to have a comprehensive overview of SOTA LLMs, technical details, applications, datasets, benchmarks, challenges, and future directions.

Little Guide to Building Large Language Models in 2024 - HuggingFace [Link]

Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks [Link]

Bloomberg built a model specialized on financial data (BloombergGPT), only for this study to find that GPT-4 8k, without any finance-specific fine-tuning, beat it on almost all finance tasks. So is there really a moat? The number of parameters matters and data size matters, and both require compute and money.

Jamba: A Hybrid Transformer-Mamba Language Model [Link] [Link]

The Mamba paper was rejected, while its fruits are being reaped fast: MoE-Mamba, Vision Mamba, and Jamba. It's funny to see this kind of asymmetric impact in ML sometimes, e.g., FlashAttention has <500 citations yet is used everywhere, and GitHub repos used by 10k+ people have <100 citations, etc.

KAN: Kolmogorov-Arnold Networks [Link] [authors-note]

This is a mathematically beautiful idea. The main difference from a traditional MLP is that KAN places learnable activation functions on the edges, where the weights used to be, so every "weight" in a KAN is a learnable univariate non-linear function. KAN outperforms MLP in accuracy and interpretability. Whether KAN can replace MLP in the future depends on whether suitable learning algorithms (SGD, AdamW, etc.) exist for it and whether it can be made GPU-friendly.
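
For intuition, here is a toy sketch of a single KAN "weight" as a learnable univariate function; I use a piecewise-linear interpolant on a fixed grid as a stand-in for the paper's B-splines, and all sizes are illustrative:

```python
import torch
import torch.nn as nn

class KANEdge(nn.Module):
    """One KAN 'weight': a learnable univariate function instead of a
    scalar (piecewise-linear stand-in for the paper's B-splines)."""

    def __init__(self, grid_min=-2.0, grid_max=2.0, num_knots=8):
        super().__init__()
        self.register_buffer("knots", torch.linspace(grid_min, grid_max, num_knots))
        self.heights = nn.Parameter(torch.randn(num_knots) * 0.1)  # learnable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.clamp(x, self.knots[0].item(), self.knots[-1].item())
        idx = torch.searchsorted(self.knots, x).clamp(1, len(self.knots) - 1)
        x0, x1 = self.knots[idx - 1], self.knots[idx]
        y0, y1 = self.heights[idx - 1], self.heights[idx]
        t = (x - x0) / (x1 - x0)                      # linear interpolation
        return y0 + t * (y1 - y0)

# A KAN layer output is then out_j = sum_i edge_ij(x_i): a sum of learned
# univariate functions rather than a linear combination of inputs.
```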

The Platonic Representation Hypothesis [Link]

An interesting paper to read if you like philosophy. It argues that there is a platonic representation resulting from the convergence of AI models towards a shared statistical model of reality. The authors show a growing similarity in data representations across different model architectures, training objectives, and data modalities as model size, data size, and task diversity grow. They also propose three hypotheses for the representational convergence: 1) the multitask scaling hypothesis, 2) the capacity hypothesis, and 3) the simplicity bias hypothesis. The counterexamples and limitations are definitely worth reading too.

Frontier Safety Framework - Google DeepMind [Link]

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model [Link]

One main improvement: multi-head latent attention (MLA) compresses keys and values into a small latent vector, so a much smaller KV cache is needed per token while achieving stronger performance. Each head reads the shared compressed latent through its own up-projection (taking a different portion of it), and keys and values use different up-projections.
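
A minimal sketch of the compressed-KV idea, with illustrative dimensions rather than DeepSeek-V2's actual configuration; the attention computation itself is omitted:

```python
import torch
import torch.nn as nn

class LatentKV(nn.Module):
    """Cache one small latent per token instead of full per-head K/V;
    keys and values are re-expanded from the latent when needed."""

    def __init__(self, d_model=512, d_latent=64, n_heads=8, d_head=64):
        super().__init__()
        self.down = nn.Linear(d_model, d_latent)            # compress once
        self.up_k = nn.Linear(d_latent, n_heads * d_head)   # expand per use
        self.up_v = nn.Linear(d_latent, n_heads * d_head)
        self.n_heads, self.d_head = n_heads, d_head

    def forward(self, h):
        # h: (batch, seq, d_model); only c needs to be cached at inference,
        # shrinking the KV cache by roughly 2 * n_heads * d_head / d_latent.
        c = self.down(h)                                    # (batch, seq, d_latent)
        k = self.up_k(c).view(*c.shape[:2], self.n_heads, self.d_head)
        v = self.up_v(c).view(*c.shape[:2], self.n_heads, self.d_head)
        return k, v, c
```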

What matters when building vision-language models [Link]

The Unreasonable Ineffectiveness of the Deeper Layers [Link]

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models [Link]

This paper, published by Google DeepMind, proposes a language model called RecurrentGemma that can match or exceed the performance of transformer-based models while being more memory efficient.

Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach - Google’s Tech Report of LearnLM [Link]

Chameleon: Mixed-Modal Early-Fusion Foundation Models [Link]

This paper, published by Meta, proposes a mixed-modal model that uses the Transformer architecture under the covers but adds innovations such as query-key normalization to fix the imbalance between text and image tokens, among others.

Simple and Scalable Strategies to Continually Pre-train Large Language Models [Link]

Tricks for successful continued pretraining:

  1. Re-warming and re-decaying the learning rate.
  2. Adding a small portion (e.g., 5%) of the original pretraining data (D1) to the new dataset (D2) to prevent catastrophic forgetting.
    Note that smaller fractions like 0.5% and 1% were also effective.

Be cautious about their validity on models of larger sizes.
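
A minimal sketch of the two tricks, with illustrative hyperparameters:

```python
import math
import random

def rewarmed_lr(step, total_steps, max_lr=3e-4, min_lr=3e-5, warmup_steps=1000):
    """Re-warm the learning rate from min_lr to max_lr, then cosine
    re-decay back to min_lr over the continued-pretraining run."""
    if step < warmup_steps:
        return min_lr + (max_lr - min_lr) * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

def sample_example(d1, d2, replay_frac=0.05):
    """Mix ~5% of the original pretraining data (d1) into the new data (d2)
    to mitigate catastrophic forgetting; 0.5-1% also worked in the paper."""
    return random.choice(d1) if random.random() < replay_frac else random.choice(d2)
```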

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study [Link]

Algorithmic Progress in Language Models [Link]

Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws [Link]

Efficient Multimodal Large Language Models: A Survey [Link]

Good overview of multimodal LLMs.

Financial Statement Analysis with Large Language Models [Link]

LoRA Learns Less and Forgets Less [Link]

Lessons from the Trenches on Reproducible Evaluation of Language Models [Link]

Challenges and best practices in evaluating LLMs.

Agent Planning with World Knowledge Model [Link]

GitHub Repo

Google Research Tuning Playbook - GitHub [Link]

ML Engineering - GitHub [Link]

LLM from Scratch [Link]

Prompt Engineering Guide [Link] [Link]

ChatML + chat templates + Mistral v3 7b full example [Link]

Finetune pythia 70M [Link]

Llama3 Implemented from Scratch [Link]

News

Intel Inside Ohio [Link]

Intel Ohio One Campus Video Rendering [Link]

Intel Corp has committed \(\$28\)B to build a “mega fab” called Ohio One, which could be the biggest chip factory on Earth. The Biden administration has agreed to provide Intel with \(\$19.5\)B in loans and grants to help finance the project.

EveryONE Medicines: Designing Drugs for Rare Diseases, One at a Time [Link]

The startup EveryONE Medicines aims to develop drugs designed from genetic information for individual children with rare, life-threatening neurological diseases. Since the total number of patients with diseases caused by rare mutations is significant, the market is large if EveryONE can scale its process. Although the cost won't be the same as for a standard drugmaker that runs large clinical trials, the challenge is ensuring safety without a standard clinical-testing protocol. To be responsible to patients, the initial drugs will have a temporary effect and a wide therapeutic window, so any potential toxicity can be minimized or stopped if it appears.

Voyager 1’s Communication Malfunctions May Show the Spacecraft’s Age [Link]

In Nov 2023, NASA's more than 46-year-old Voyager 1 spacecraft started sending nonsense to Earth. Voyager 1 was initially intended to study Jupiter and Saturn and was built to survive only 5 years of flight; however, its trajectory carried it farther and farther into space, and the mission evolved from a two-planet mission into an interstellar one.

In Dec 2023, the mission team restarted the Flight Data Subsystem (FDS) but failed to return the subsystem to a functional state. On Mar 1, 2024, they sent a "poke" command to the probe and received a response on Mar 3. On Mar 10, the mission team finally determined that the response carried a readout of FDS memory. By comparing the readout with those received before the issue, the team confirmed that 3% of FDS memory was corrupted. On Apr 4, the team concluded that the affected code was contained on a single computer chip. To solve the problem, the team decided to divide the affected code into smaller sections and insert those sections into other working locations in FDS memory. During Apr 18-20, the team sent the commands to move some of the affected code and received responses with intelligible system information.

Editing the Human Genome with AI [Link]

The Berkeley-based startup Profluent Bio used an AI protein language model to create, and train on, an entirely new library of Cas proteins that do not exist in nature today, eventually finding one, called 'OpenCRISPR-1', that is able to replace or improve the ones on the market today. The goal of the AI model is to learn which DNA sequences generate which protein structures that are good at gene editing. The new library of Cas proteins was created by generating trillions of letters in simulation. They made 'OpenCRISPR-1' publicly available under an open-source license so anyone can use this particular Cas protein.

Sony and Apollo in Talks to Acquire Paramount [Link]

Paramount's stock declined 44% in 2022 and another 12% in 2023. The company is experiencing declining revenue as consumers abandon traditional pay-TV, and its streaming business is losing money. Berkshire sold its entire Paramount stake, and soon after, Sony Pictures and Apollo Global Management reached out to the Paramount board expressing interest in an acquisition. Paramount has now decided to open negotiations with them after exclusive talks with the Hollywood studio Skydance. If successful, this deal would break up Paramount and potentially transform the media landscape. Otherwise, an Office of the CEO, replacing CEO Bob Bakish, will prepare a long-term plan for the company.

AlphaFold 3 predicts the structure and interactions of all of life’s molecules [Link]

Previously, Google DeepMind's AlphaFold project took 3D images of proteins and the DNA sequences that code for those proteins, and built a model that predicts the 3D structure of a protein based on its DNA sequence. What is different in AlphaFold 3 is that small molecules are now included: how small molecules bind to the protein is part of the predictive model. This is a breakthrough in that off-target effects could be minimized by taking into consideration other molecules' interactions in the biochemical environment. Google has a drug development subsidiary called Isomorphic Labs and kept all of the IP for AlphaFold 3: they published a web viewer for non-commercial scientists to do fundamental research, but only Isomorphic Labs can use it commercially.

Introducing GPT-4o and making more capabilities available for free in ChatGPT [Link]

I missed the live announcement but watched the recording. GPT-4o is amazing.

One interesting technical difference is the tokenizer. GPT-4 and GPT-4-Turbo both had a tokenizer with a vocabulary of 100k tokens; GPT-4o has a tokenizer with a 200k-token vocabulary, which works better for native multimodality and multilingualism. A larger vocabulary means fewer tokens are needed per character, so generation is more efficient.
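
You can see the effect with the tiktoken library, which exposes both vocabularies; the sample text is arbitrary:

```python
import tiktoken  # pip install tiktoken (o200k_base needs a recent version)

cl100k = tiktoken.get_encoding("cl100k_base")  # GPT-4 / GPT-4-Turbo tokenizer
o200k = tiktoken.get_encoding("o200k_base")    # GPT-4o tokenizer

text = "你好，世界！这是一段测试文本。"          # non-English text benefits most
print("cl100k tokens:", len(cl100k.encode(text)))
print("o200k tokens:", len(o200k.encode(text)))
# Expect o200k_base to need noticeably fewer tokens on most non-English text.
```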

“Our goal is to make it effortless for people to go anywhere and get anything,” said Dara Khosrowshahi, CEO of Uber. “We’re excited that this new strategic partnership with Instacart will bring the magic of Uber Eats to even more consumers, drive more business for restaurants, and create more earnings opportunities for couriers.”

Project Astra: Our vision for the future of AI assistants [Link]

Google Keynote (Google I/O ’24) [Link]

This developer conference is about Google's AI-related product updates. Highlighted features: 1) AI Overview for Search, 2) Ask Photos, 3) 2M-token context window, 4) Google Workspace, 5) NotebookLM, 6) Project Astra, 7) Imagen 3, 8) Music AI Sandbox, 9) Veo, 10) Trillium TPU, 11) Google Search, 12) asking questions with videos, 13) Gemini interacting with Gmail and data, 14) Gemini AI Teammate, 15) Gemini App and upgrades, 16) Gemini trip planning.

Leike went public with some reasons for his resignation on Friday morning. “I have been disagreeing with OpenAI leadership about the company’s core priorities for quite some time, until we finally reached a breaking point,” Leike wrote in a series of posts on X. “I believe much more of our bandwidth should be spent getting ready for the next generations of models, on security, monitoring, preparedness, safety, adversarial robustness, (super)alignment, confidentiality, societal impact, and related topics. These problems are quite hard to get right, and I am concerned we aren’t on a trajectory to get there.”

― OpenAI created a team to control ‘superintelligent’ AI — then let it wither, source says [Link]

Other News:

Encampment Protesters Set Monday Deadline for Harvard to Begin Negotiations [Link]

Israel Gaza war: History of the conflict explained [Link]

Cyber Stuck: First Tesla Cybertruck On Nantucket Has A Rough Day [Link]

Apple apologizes after ad backlash [Link]

Apple nears deal with OpenAI to put ChatGPT on iPhone: Report [Link] [Link]

Reddit announces another big data-sharing AI deal — this time with OpenAI [Link]

Apple Will Revamp Siri to Catch Up to Its Chatbot Competitors [Link]

OpenAI strikes deal to bring Reddit content to ChatGPT [Link]

Random words:

My music teacher never answered my question: why should the triangle be included in a piece of music when there are already 10+ instruments sounding loudly? I accidentally got the answer from my dance teacher. She said: "Different people have different hearing capabilities and thus different understandings of music; what dancers are doing is actually interpreting or reproducing music."

Back to the topic:

There is always a lot to learn about strategic thinking from Zuck. Here are some of the smart strategies I've learned behind his open-source moves:

  1. According to this interview, Zuck's point about open source is to avoid the concentration of AI, while not ignoring the harmful consequences of open source: he says it's our responsibility to do a good job of reducing harm. There are several benefits of open source. One is that people can figure out cheaper ways to develop models, so development won't consume so many resources. Another is that it enables more efficient development and vertical use cases across a lot of different systems. Take Google and Apple for example: their mobile ecosystems restricted what developers could build and what features they could launch.

  2. Companies like Meta, with a well-established network effect, really don't need to have the best model. AI's content creation potential benefits Meta's platforms even if the models are not exclusively theirs. This is the most compelling reason from a business perspective and was stated in an earnings call.

  3. By open-sourcing models, Meta starts developer communities which can contribute to whatever ecosystem Meta builds and help solidify its advantage. The most recent example is the opening of Horizon OS, which powers its VR headsets: it allows developers and creators to take advantage of these technologies to create MR experiences and grow businesses on it, so the Meta Quest Store can be quickly established.

  4. Models themselves are not a moat; moats are built through data and habits. Open source eventually disintegrates the economic value of foundation models. There will be little economic value in foundation models themselves, and probably less point for VCs to plow billions of dollars into foundation-model startups. The potential economic value is in 100K+ developers iteratively and quickly training and deploying open-source models for specific business use cases. Inference will be far more important than training, so attention and money will be less concentrated in products like OpenAI's GPT series and Nvidia's training GPUs, and more in cloud platforms with inference GPUs for personal and business usage.

I have read a lot of news and articles and watched a few interviews about AI safety these days. The future of AI is promising: people are working towards more powerful AI and personalized AI assistants or agents. At the same time, AI causes problems. An obvious downside is that people could use AI to do harmful things, but I think we can work together to prevent that from different angles, such as the legal system and open-source software. I worry more about things that are not easily avoided and cannot be seen, at least in these years while AI is still immature: relying on AI too early and thus deviating from the truth. For example, it is too soon for us to lose faith in jobs like teachers, historians, journalists, and writers, but I'm concerned we are already losing faith in those jobs because of the development of AI and some violations of copyrighted work. As we have seen, AI can have a wrong understanding of facts, hold biased opinions, and make up things that don't exist. We can fight for the truth, but the dead cannot speak for themselves. It would be pathetic if humans lived in hallucinations in the future, and I don't know if there is any good practice to prevent it; it's as if Pandora's box has been opened and the complications cannot be stopped. But we should at least think seriously about the potential impact of AI on society and human consciousness, and about possible unexpected consequences.

Articles

As Columbia Business School professor Rita McGrath points out, it’s about identifying “the strategic inflection points” at which the cost of waiting exceeds the cost of acting — in other words, identifying the most strategic point to enter a market or adopt a technology, balancing the risks and opportunities based on market readiness, technological maturity and organizational capacity.

This speaks to the growing adoption of agile, “act, learn, build” approaches over traditional “prove-plan-execute” orientations. The popularity of techniques like discovery-driven planning, the lean startup, and other agile approaches has propagated this philosophy in which, rather than building bullet-proof business cases, one takes small steps, learns from them, and decides whether to invest further.

― “6 Strategic Concepts That Set High-Performing Companies Apart”, Harvard Business Review [Article]

It's a very good read. It pairs each strategic concept with real-world business examples:

  1. "Borrow someone's road": Nvidia's partnership with ARM Holdings and Amazon's Alexa offering.
  2. "Partner with a third party": Microsoft's decision to make Office available on Apple's iOS devices in 2014, and Microsoft's partnerships with Adobe, Salesforce, and Google.
  3. "Reveal your strategy": Deere & Co's decision to openly invest in precision agriculture technologies.
  4. "Be good": Mastercard's "Beyond Cash" initiative in 2012.
  5. "Let the competition go": Ferrari's strategic entry into the luxury SUV market.
  6. "Adopt small-scale attacks": Tesla's modular approach to battery manufacturing.

Gig work is structured in a way that strengthens the alignment between customers and companies and deepens the divide between customers and workers, leading to systemic imbalances in its service triangle.

Bridging the customer-worker divide can result in higher customer trust and platform commitment, both by the customer and the worker.

To start, platforms need to increase transparency, reduce information asymmetry, and price their services clearly, allowing customers to better understand what they are paying for rather than only seeing an aggregated total at the end of the transaction. This, in turn, can help customers get used to the idea that if workers are to be paid fairly, gig work cannot be a free or low-cost service.

Gig workers might be working through an app, but they are not robots, and they deserve to be treated respectfully and thoughtfully. So tip well, rate appropriately, and work together to make the experience as smooth as possible both for yourself and for workers.

― “How Gig Work Pits Customers Against Workers”, Harvard Business Review [Article]

This is a good article for better understanding how gig work is structured differently from other business models, and what the key points are for better business performance and triangle relationships.

TCP/IP unlocked new economic value by dramatically lowering the cost of connections. Similarly, blockchain could dramatically reduce the cost of transactions. It has the potential to become the system of record for all transactions. If that happens, the economy will once again undergo a radical shift, as new, blockchain-based sources of influence and control emerge.

“Smart contracts” may be the most transformative blockchain application at the moment. These automate payments and the transfer of currency or other assets as negotiated conditions are met. For example, a smart contract might send a payment to a supplier as soon as a shipment is delivered. A firm could signal via blockchain that a particular good has been received, or the product could have GPS functionality, which would automatically log a location update that, in turn, triggered a payment. We’ve already seen a few early experiments with such self-executing contracts in the areas of venture funding, banking, and digital rights management.

The implications are fascinating. Firms are built on contracts, from incorporation to buyer-supplier relationships to employee relations. If contracts are automated, then what will happen to traditional firm structures, processes, and intermediaries like lawyers and accountants? And what about managers? Their roles would all radically change. Before we get too excited here, though, let’s remember that we are decades away from the widespread adoption of smart contracts. They cannot be effective, for instance, without institutional buy-in. A tremendous degree of coordination and clarity on how smart contracts are designed, verified, implemented, and enforced will be required. We believe the institutions responsible for those daunting tasks will take a long time to evolve. And the technology challenges, especially security, are daunting.

― “The Truth About Blockchain”, Harvard Business Review [Article]

This is the second blockchain-related article I have read from Harvard Business Review, and different authors have different perspectives. Unlike the previous article, with its many concerns and cautions about Web3, this one seems more optimistic. It proposes a framework for adopting blockchain to revolutionize modern business, and guidance for blockchain investment. It points out that blockchain has great potential to boost efficiency and reduce cost for all transactions, and then explains why the adoption of blockchain will be slow by comparing it with TCP/IP, which took more than 30 years to reshape the economy by dramatically lowering the cost of connections. This is an interesting comparison: e-mail enabled bilateral messaging as the first application of TCP/IP, while bitcoin enables bilateral financial transactions as the first application of blockchain. It reminds me of what people (Jun Lei, Huateng Ma, Lei Ding, etc.) were thinking about the internet mindset and business models back in the 2000s.

In the end, the authors propose a four-quadrant framework for adopting blockchain step by step. The quadrants are created by two dimensions: novelty (the amount of effort required to ensure users understand the problem) and complexity (the amount of coordination and collaboration required to produce value). As both dimensions increase, adoption requires more institutional change. An example of "low novelty, low complexity" is simply adding bitcoin as an alternative payment method. An example of "low novelty, high complexity" is building a new, fully formed cryptocurrency system, which requires wide adoption by every monetary transaction party and consumers' complete understanding of cryptocurrency. An example of "high novelty, low complexity" is building a local private network in which multiple organizations are connected via a distributed ledger. An example of "high novelty, high complexity" is building "smart contracts".

News

Does Amazon’s cashless Just Walk Out technology rely on 1,000 workers in India? [Link]

Amazon insists Just Walk Out isn’t secretly run by workers watching you shop [Link]

An update on Amazon’s plans for Just Walk Out and checkout-free technology [Link]

It's been reported that there are over 1,000 Indian workers behind the cameras of Just Walk Out. It sounds dystopian and reminds me of the 2013 movie "Snowpiercer". In 2022, about 700 of every 1,000 Just Walk Out sales had to be reviewed by Amazon's team in India, according to The Information. An Amazon spokesperson explained that the technology is built on AI (computer vision and deep learning), while it does rely on human moderators and data labelers. Amazon clarified that it's not true that Just Walk Out relies on human reviewers: object detection and receipt generation are completely AI-powered, so no humans are watching live videos. But humans are responsible for labeling and annotation in data preparation, which also requires watching videos.

I guess the technology was not able to complete the task end-to-end by itself without supervision, or it's still in the development stage? I believe it could be Amazon's strategy to build and test Just Walk Out, Amazon Dash Cart, and Amazon One at the same time while improving the AI system, since they are "just getting started". As Amazon found out that customers prefer Dash Cart in large stores, it has already expanded Dash Cart to all Amazon Fresh stores as well as third-party grocers. Customers prefer Just Walk Out in small stores, so it's now available in 140+ third-party locations. Customers love Amazon One's security and convenience regardless of store size, so it's now available at 500+ Whole Foods Market stores, some Amazon stores, and 150+ third-party locations.

Data centres consume water directly to prevent information technology equipment from overheating. They also consume water indirectly from coal-powered electricity generation.

The report said that if 100 million users had a conversation with ChatGPT, the chatbot “would consume 50,000 cubic metres of water – the same as 20 Olympic-sized swimming pools – whereas the equivalent in Google searches would only consume one swimming pool”.

― China’s thirsty data centres, AI industry could use more water than size of South Korea’s population by 2030: report warns [Link]

The rapid growth of AI could dramatically increase demand on water resources. AI eats tokens, consumes compute, and drinks water.
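
As a sanity check on the quoted figure, an Olympic-sized pool holds roughly \(2{,}500\ \mathrm{m^3}\) of water, so \(50{,}000\ \mathrm{m^3} \div 2{,}500\ \mathrm{m^3} = 20\) pools, consistent with the report's claim.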

15 Graphs That Explain the State of AI in 2024 [Link]

The Stanford Institute for Human-Centered Artificial Intelligence (HAI) published the 2024 AI Index report [Link]. The 502-page reading journey has started.

(screenshots of the leaked internal emails discussing the Instagram acquisition)

― Leaked emails reveal why Mark Zuckerberg bought Instagram [Link]

Zuckerberg's discussion of the Instagram acquisition back in 2012 proved his corporate strategic foresight. He was aiming to buy time and network effects, rather than simply neutralizing a competitor or improving products. He bought Instagram for \(\$1\)B; today it is worth \(\$500\)B. It's very impressive.

Introducing Meta Llama 3: The most capable openly available LLM to date [Link]

Llama 3: Scaling open LLMs to AGI [Link]

Meta released early versions of Llama 3. Pretrained and instruction-fine-tuned Llama 3 models with 8B and 70B parameters are now open source; the 405B version is still training.

Llama 3 introduces Grouped Query Attention (GQA), which reduces the memory and compute cost of processing long sequences by letting groups of query heads share key/value heads. Llama 3 also had extensive pre-training on over 15 trillion tokens, including a significant amount of content in different languages, enhancing its applicability across diverse linguistic contexts. Post-training techniques include supervised fine-tuning and rejection sampling, which refine the model's ability to follow instructions and minimize errors.
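
A minimal sketch of the GQA computation; shapes are illustrative, and Llama 3's real attention adds RoPE, masking, and KV caching on top:

```python
import torch

def grouped_query_attention(q, k, v):
    """GQA: several query heads share each key/value head, shrinking the
    KV cache. q: (batch, n_q_heads, seq, d_head); k, v:
    (batch, n_kv_heads, seq, d_head), n_q_heads a multiple of n_kv_heads."""
    group = q.shape[1] // k.shape[1]
    k = k.repeat_interleave(group, dim=1)  # each KV head serves `group` query heads
    v = v.repeat_interleave(group, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return scores.softmax(dim=-1) @ v

# Usage: 8 query heads sharing 2 KV heads
# out = grouped_query_attention(torch.randn(1, 8, 10, 64),
#                               torch.randn(1, 2, 10, 64),
#                               torch.randn(1, 2, 10, 64))  # (1, 8, 10, 64)
```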

Cheaper, Better, Faster, Stronger - Continuing to push the frontier of AI and making it accessible to all. [Link]

Mistral AI's Mixtral 8x22B has a Sparse Mixture-of-Experts (SMoE) architecture, which maximizes efficiency by activating only 39B of its 141B parameters per token. The architecture ensures that only the most relevant "experts" are activated for a given input. The experts are individual feed-forward networks within the SMoE model, trained to become proficient at particular sub-tasks of the overall task. Since only a few experts are engaged for any given input, this design reduces computational cost.
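
A toy top-2 sparse MoE layer showing the routing idea; dimensions and the router are illustrative, not Mixtral's actual implementation:

```python
import torch
import torch.nn as nn

class Top2SparseMoE(nn.Module):
    """A router sends each token to 2 of n_experts feed-forward experts,
    so only a fraction of the layer's parameters are active per token."""

    def __init__(self, d_model=64, n_experts=8, d_ff=256):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                            # x: (n_tokens, d_model)
        weights, idx = self.router(x).topk(2, dim=-1)
        weights = weights.softmax(dim=-1)            # mixing weights of the 2 picks
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):    # each expert only sees its tokens
            for slot in range(2):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```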

GPT-4 Turbo and GPT-4 [Link]

GPT-4-Turbo significantly enhanced its multimodal capabilities by incorporating vision technology: the model can analyze images alongside text. It also has a larger 128,000-token context window, which greatly expands its working memory.

The race to lead A.I. has become a desperate hunt for the digital data needed to advance the technology. To obtain that data, tech companies including OpenAI, Google and Meta have cut corners, ignored corporate policies and debated bending the law, according to an examination by The New York Times.

Tech companies are so hungry for new data that some are developing “synthetic” information. This is not organic data created by humans, but text, images and code that A.I. models produce — in other words, the systems learn from what they themselves generate.

― How Tech Giants Cut Corners to Harvest Data for A.I. [Link]

OpenAI developed a speech recognition tool, Whisper, to transcribe the audio of YouTube videos, generating text data for its AI systems. Google employees knew OpenAI had harvested YouTube videos for data, but they didn't stop OpenAI because Google had also used transcripts of YouTube videos to train its own AI models. Google's rules about the legal usage of YouTube videos are vague, and OpenAI's employees were wading into a legal gray area.

As many tech companies such as Meta and OpenAI reached the stage of data shortage, OpenAI started to train AI models using synthetic data produced by two different AI models: one produces the data, the other judges the information.

Grok-1.5 Vision Preview [Link]

xAI released a preview of its first multimodal model, Grok-1.5V, which is able to understand both textual and visual information. One notable detail is that its distributed training architecture is built on Rust, JAX, and Kubernetes.

One page of the Microsoft presentation highlights a variety of “common” federal uses for OpenAI, including for defense. One bullet point under “Advanced Computer Vision Training” reads: “Battle Management Systems: Using the DALL-E models to create images to train battle management systems.” Just as it sounds, a battle management system is a command-and-control software suite that provides military leaders with a situational overview of a combat scenario, allowing them to coordinate things like artillery fire, airstrike target identification, and troop movements. The reference to computer vision training suggests artificial images conjured by DALL-E could help Pentagon computers better “see” conditions on the battlefield, a particular boon for finding — and annihilating — targets.

OpenAI spokesperson Liz Bourgeous said OpenAI was not involved in the Microsoft pitch and that it had not sold any tools to the Department of Defense. “OpenAI’s policies prohibit the use of our tools to develop or use weapons, injure others or destroy property,” she wrote. “We were not involved in this presentation and have not had conversations with U.S. defense agencies regarding the hypothetical use cases it describes.”

Microsoft told The Intercept that if the Pentagon used DALL-E or any other OpenAI tool through a contract with Microsoft, it would be subject to the usage policies of the latter company. Still, any use of OpenAI technology to help the Pentagon more effectively kill and destroy would be a dramatic turnaround for the company, which describes its mission as developing safety-focused artificial intelligence that can benefit all of humanity.

― Microsoft Pitched OpenAI’s DALL-E as Battlefield Tool for U.S. Military [Link]

Beyond what is mentioned in the news: by cooperating with the Department of Defense, AI could learn how humans fight and defend, which is hard to learn from the textual and visual information currently on the internet. So it's possible that this is the first step toward AI troops.

Microsoft scientists developed what they call a qubit virtualization system. This combines quantum error-correction techniques with strategies to determine which errors need to be fixed and the best way to fix them.

The company also developed a way to diagnose and correct qubit errors without disrupting them, a technique it calls “active syndrome extraction.” The act of measuring a quantum state such as superposition typically destroys it. To avoid this, active syndrome extraction instead learns details about the qubits that are related to noise, as opposed to their quantum states, Svore explains. The ability to account for this noise can permit longer and more complex quantum computations to proceed without failure, all without destroying the logical qubits.

― Microsoft Tests New Path to Reliable Quantum Computers: 1,000 physical qubits for each logical one? Try a dozen, says Redmond [Link]

Think about it in the sense of another broad, diverse category like cars. When they were first invented, you just bought “a car.” Then a little later, you could choose between a big car, a small car, and a tractor. Nowadays, there are hundreds of cars released every year, but you probably don’t need to be aware of even one in ten of them, because nine out of ten are not a car you need or even a car as you understand the term. Similarly, we’re moving from the big/small/tractor era of AI toward the proliferation era, and even AI specialists can’t keep up with and test all the models coming out.

― Too Many Models [Link]

This week, the pace of LLM releases reached about 10 per week. This article gives a good explanation of why we don't need to keep up with, or test, every released model. The car is a good analogy for AI models nowadays: there are all kinds of brands and sizes, designed for different purposes. Hundreds of cars are released every year, but you don't need to know them all. The majority of models are not groundbreaking, but whenever there is a big step, you will be aware of it.

Although it's not necessary to catch up with all the news, we should at least be aware of the main features that modern and future LLMs are heading towards: 1) multimodality, 2) recall capability, and 3) reasoning.

ByteDance Exploring Scenarios for Selling TikTok Without Algorithm [Link]

ByteDance is internally exploring scenarios for selling TikTok's US business, without the algorithm, to a buyer outside the tech industry if it exhausts all legal options to fight the ban legislation. It's hard to imagine who, without car expertise, would buy a car without an engine.

Developers and creators can take advantage of all these technologies to create mixed reality experiences. And they can reach their audiences and grow their businesses through the content discovery and monetization platforms built into Meta Horizon OS, including the Meta Quest Store, which we’ll rename the Meta Horizon Store.

Introducing Our Open Mixed Reality Ecosystem [Link]

Everyone knows how smart Zuck is about open source.

Other news:

Elon Musk says Tesla will reveal its robotaxi on August 8th [Link]

SpaceX launches Starlink satellites on record 20th reflight of a Falcon 9 rocket first stage [Link]

Deploy your Chatbot on Databricks AI with RAG, DBRX Instruct, Vector Search & Databricks Foundation Models [Link]

Adobe’s ‘Ethical’ Firefly AI Was Trained on Midjourney Images [Link]

Exclusive: Microsoft’s OpenAI partnership could face EU antitrust probe, sources say [Link]

Meta AI adds Google Search results [Link]

Our next-generation Meta Training and Inference Accelerator [Link]

Meta’s new AI chips run faster than before [Link]

Anthropic-cookbook: a collection of notebooks / recipes showcasing some fun and effective ways of using Claude [Link]

Amazon deploys 750,000+ robots to unlock AI opportunities [Link]

Apple’s four new open-source models could help make future AI more accurate [Link]

The Mystery of ‘Jia Tan,’ the XZ Backdoor Mastermind [Link]

YouTube

If someone whom you don’t trust or an adversary gets something more powerful, then I think that that could be an issue. Probably the best way to mitigate that is to have good open source AI that becomes the standard and in a lot of ways can become the leader. It just ensures that it’s a much more even and balanced playing field.

― Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters [Link]

What I learned from this interview: the future of Meta AI would be a kind of general AI assistant product, where you give it complicated tasks and it goes away and does them. Meta will probably build bigger clusters. No one has built a 1GW data center yet, but building one may just be a matter of time. Open source can be both bad and good: people can use LLMs to do harmful things, but what Mark worries about more is the concentration of AI, where a single untrustworthy actor holds the strongest AI. Open-source software keeps AI from getting stuck in one company and lets it be deployed broadly across many different systems; people can set standards for how it works, and the AI can be checked and upgraded collectively.

It is clear that inference was going to be a scaled problem. Everyone else had been looking at inference as you take one chip, you run a model on it, it runs whatever. But what happened with AlphaGo was we ported the software over, and even though we had 170 GPUs vs 48 TPUs, the 48 TPUs won 99 out of 100 games with the exact same software. What that meant was compute was going to result in better performance. And so the insight was - let’s build scaled inference.

(Nvidia) They have the ecosystem. It’s a double-sided market. If they have a kernel-based approach they already won. There’s no catching up. The other way that they are very good is vertical integration and forward integration. What happens is Nvidia over and over again decides that they want to move up the stack, and whatever the customers are doing, they start doing it.

Nvidia is incredible at training. And I think the design decision that they made including things like HBM, were really oriented around the world back then, which was everything is about training. There weren’t any real world application. None of you guys were really building anything in the wild where you needed super fast inference.

What we saw over and over again was you would spend 100% of your compute on training, you would get something that would work well enough to go into production, and then it would flip to about 5%-10% training and 90%-95% inference. But the amount of training would stay the same, the inference would grow massively. And so every time we would have a success at Google, all of a sudden, we would have a disaster, we called it the success disaster, where we can’t afford to get enough compute for inference.

HBM is this High Bandwidth Memory which is required to get performance, because the speed at which you can run these applications depends on how quickly you can read that into memory. There’s a finite supply, it’s only for data centers, so they can’t reach into the supply for mobile or other things, like you can with other parts. Also Nvidia is the largest buyer of super caps in the world and all sorts of other components. The 400 gigabit cables, they’ve bought them all out. So if you want to compete, it doesn’t matter how good of a product you design, they’ve bought out the entire supply chain for years.

The biggest difference between training and inference is when you are training, the number of tokens that you are training on is measured in months, like how many tokens can we train on this month. In inference, what matters is how many tokens you can generate per millisecond or a couple of milliseconds.

It's fair to say that Nvidia is the exemplar in training but really isn't yet the equivalent scaled winner in inference.

In order to get the latency down, we had to design a completely new chip architecture, a completely new networking architecture, an entirely new system, an entirely new runtime, an entirely new compiler, and an entirely new orchestration layer. We had to throw everything away, and it had to be compatible with PyTorch and whatever else people are actually developing in.

I think Facebook announced that by the end of this year, they are going to have the equivalent of 650000 H100s. By the end of this year, Groq will have deployed 100000 of our LPUs, which outperform the H100s on both a throughput and a latency basis. So we will probably get pretty close to the equivalent of Meta ourselves. By the end of next year, we are going to deploy 1.5M LPUs; for comparison, last year Nvidia deployed a total of 500000 H100s. So 1.5M means Groq will probably have more inference GenAI capacity than all of the hyperscalers and cloud service providers combined, probably about 50% of the inference compute in the world.

I get asked a lot should we be afraid of AI and my answer to that is, if you think back to Galileo, someone who got in a lot of trouble. The reason he got in trouble was he invented the telescope, popularized it, and made some claims that we were much smaller than everyone wanted to believe. The better the telescope got the more obvious it became that we were small. In a large sense, LLMs are the telescope for the mind, it’s become clear that intelligence is larger than we are and it makes us feel really really small and it’s scary. But what happened over time was as we realized the universe was larger than we thought and we got used to that, we started to realize how beautiful it was and our place in the universe. And I think that’s what’s going to happen. We’re going to realize intelligence is more vast than we ever imagined. And we are going to understand our place in it, and we are not going to be afraid of it.

― Conversation with Groq CEO Jonathan Ross [Link]

This is a very insightful conversation, especially the comparison of training and inference. The answer to the final question is a fascinating way to end it. A great takeaway: intelligence is a telescope for the mind, in that we realize we are small, but we also get the opportunity to see that intelligence is vast, and to not be afraid of it.

Meta Announces Llama 3 at Weights & Biases’ Conference - Weights & Biases [Link]

Economic value is getting disintegrated; there's no value in foundational models economically. So then the question is who can build on top of them the fastest. Llama was announced last Thursday, and 14 hours later Groq actually had that model deployed in the Groq Cloud, so that 100K+ developers could start building on it. That's why that model is so popular. It puts the closed models on their heels. Because if you can't both train and deploy iteratively and quickly enough, these open source alternatives will win, and as a result the economic potential that you have to monetize those models will not be there. - Chamath Palihapitiya

By open-sourcing these models they limit competition because VCs are no longer going to plow half a billion dollars into a foundational model development company, so you limit the commercial interest and the commercial value of foundational models. - David Friedberg

AI is really two markets - training and inference. And inference is going to be 100 times bigger than training. And Nvidia is really good at training and very miscast at inference. The problem is that right now we need to see a capex build cycle for inference, and there are so many cheap and effective solutions, Groq being one of them but there are many others. And I think why the market reacted very negatively was that it did not seem that Facebook understood that distinction, that they were way overspending and trying to allocate a bunch of GPU capacity towards inference that didn’t make sense. - Chamath Palihapitiya

You want to find real durable moats not these like legal arrangement that try to protect your business through these types of contracts. One of the reasons why the industry moves so fast is best practices get shared very quickly, and one of the ways that happens is that everybody is moving around to different companies (average term of employment is 18-36 months). There are people who violate those rules (taking code to the new company etc), and that is definitely breaking the rules, but you are allowed to take with you anything in your head, and it is one of the ways that best practices sort of become more common. - David Sacks

Meta’s scorched earth approach to AI, Tesla’s future, TikTok bill, FTC bans noncompetes, wealth tax - All-In Podcast [Link]

Are LLMs Hitting A Wall, Microsoft & Alphabet Save The Market, TikTok Ban - Big Technology Podcast [Link]

Papers and Reports

ReALM: Reference Resolution As Language Modeling [Link]

Apple proposed the ReALM models with 80M, 250M, 1B, and 3B parameters, small enough to run on mobile devices and laptops. The task of ReALM is: "Given relevant entities and a task the user wants to perform, we wish to extract the entity (or entities) that are pertinent to the current user query." The relevant entities can be on-screen entities, conversational entities, and background entities. The analysis shows that ReALM beats MARRS and performs comparably to GPT-4.
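To make the setup concrete, here is a minimal sketch of how reference resolution can be framed as language modeling: serialize the candidate entities into the prompt and let the model emit the pertinent ids. The entity contents and the query below are invented for illustration; the paper's actual textual encoding of on-screen layout is more involved.

```python
# Hypothetical entities and query; ReALM's real on-screen encoding is richer.
entities = [
    {"id": 1, "kind": "on-screen", "text": "555-0134 (pharmacy phone number)"},
    {"id": 2, "kind": "conversational", "text": "the pharmacy on Main St"},
    {"id": 3, "kind": "background", "text": "timer: 10 minutes remaining"},
]
query = "call them"

# Build one flat prompt: numbered entities followed by the user query.
prompt = "Given the entities below, output the ids pertinent to the user query.\n"
for e in entities:
    prompt += f"{e['id']}. [{e['kind']}] {e['text']}\n"
prompt += f"Query: {query}\nPertinent ids:"

print(prompt)  # this prompt would be fed to a small decoder-only LM
```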

Bigger is not Always Better: Scaling Properties of Latent Diffusion Models [Link]

CodeGemma: Open Code Models Based on Gemma [Link]

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention [Link]

Google introduced its next-generation transformer, the Infini-Transformer. It can take inputs of unbounded length without requiring more memory or computation. Unlike the vanilla attention mechanism in a traditional transformer, which resets its attention memory after each context window to make room for new data, Infini-attention retains a compressive memory and combines masked local attention with a long-term linear attention mechanism. The model compresses and reuses key-value states across all segments, allowing it to pull relevant information from any part of the document.
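As a concrete (and heavily simplified) illustration, here is a toy numpy sketch of one segment step as I read the paper. The feature map, the gate, and the memory update are simplified stand-ins for the learned versions: the paper uses ELU+1 features, a learned sigmoid gate per head, and optionally a delta-rule memory update.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def infini_attention_segment(Q, K, V, M, z, beta=0.5):
    """One segment step: local softmax attention + compressive-memory
    readout, then fold this segment's key-value bindings into memory."""
    n, d = Q.shape
    # Masked (causal) local attention within the current segment.
    scores = Q @ K.T / np.sqrt(d)
    scores = np.where(np.tril(np.ones((n, n), dtype=bool)), scores, -np.inf)
    local = softmax(scores) @ V
    # Long-term readout: linear attention over the compressive memory.
    sq = np.maximum(Q, 0) + 1.0            # paper uses ELU+1; ReLU+1 for brevity
    mem = (sq @ M) / (sq @ z)[:, None]
    out = beta * mem + (1 - beta) * local  # beta is a learned gate in the paper
    # Memory update: size stays (d, d) no matter how many segments stream by.
    sk = np.maximum(K, 0) + 1.0
    return out, M + sk.T @ V, z + sk.sum(axis=0)

rng = np.random.default_rng(0)
d = 4
M, z = np.zeros((d, d)), np.full(d, 1e-6)  # empty memory; eps avoids 0/0
for _ in range(3):                         # stream three segments
    Q, K, V = (rng.standard_normal((5, d)) for _ in range(3))
    out, M, z = infini_attention_segment(Q, K, V, M, z)
print(out.shape, M.shape)                  # (5, 4) (4, 4): memory stays fixed
```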

AI agents are starting to transcend their digital origins and enter the physical world through devices like smartphones, smart glasses, and robots. These technologies are typically used by individuals who are not AI experts. To effectively assist them, Embodied AI (EAI) agents must possess a natural language interface and a type of “common sense” rooted in human-like perception and understanding of the world.

OpenEQA: Embodied Question Answering in the Era of Foundation Models [Link] [Link]

OpenEQA, introduced by Meta, is the first open-vocabulary benchmark dataset for the Embodied Question Answering (EQA) task: understanding an environment, either from memory or through active exploration, well enough to answer questions about it in natural language. Meta also provided an automatic LLM-powered evaluation protocol to measure the performance of SOTA models like GPT-4V and see how close they come to human-level performance.

OpenEQA looks like the very first step toward a "world model", and I'm excited that it's coming. The dataset contains over 1600 high-quality human-generated questions drawn from over 180 real-world environments. If a future AI agent could answer N questions over N real-world environments, where N approaches infinity, we could call it god-like intelligence. But we are probably not able to achieve that "world model", at least within my limited imagination, because it would require near-infinite compute resources and raises ethical issues. However, if we take one step back, instead of creating a "world model", a "human society model" or "transformation model", etc., sounds more feasible. Limiting the questions to a specific pain-point problem, and limiting the environment accordingly, would both save resources and direct AI's value toward human society.

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework [Link] [Link]

OpenELM is a small language model (SLM) family tailored for on-device applications. The models range from 270M to 3B parameters, suitable for deployment on mobile devices and PCs. The key innovation is a "layer-wise scaling" architecture: it allocates fewer parameters to the initial transformer layers and gradually increases the parameter count toward the final layers. This approach optimizes compute resources while maintaining high accuracy. OpenELM inference can run on an Intel i9 workstation with an RTX 4090 GPU and on an M2 Max MacBook Pro.
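A minimal sketch of the layer-wise scaling idea: linearly interpolate each layer's attention-head count and FFN width from small values in early layers to large values in late layers. The ranges below are made up for illustration and are not OpenELM's published hyperparameters.

```python
def layerwise_dims(n_layers, d_model, heads=(4, 16), ffn_mult=(0.5, 4.0)):
    """Linearly interpolate per-layer widths between the two endpoints."""
    cfgs = []
    for i in range(n_layers):
        t = i / max(n_layers - 1, 1)  # 0.0 at the first layer, 1.0 at the last
        cfgs.append({
            "layer": i,
            "n_heads": round(heads[0] + t * (heads[1] - heads[0])),
            "ffn_dim": int(d_model * (ffn_mult[0] + t * (ffn_mult[1] - ffn_mult[0]))),
        })
    return cfgs

for cfg in layerwise_dims(n_layers=6, d_model=1024):
    print(cfg)  # early layers are narrow, later layers are wide
```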

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone [Link] [Link]

Microsoft launched the Phi-3 family, including mini (3.8B), small (7B), and medium (14B). These models are designed to run efficiently on both mobile devices and PCs. All models use a transformer decoder architecture, with performance comparable to larger models such as Mixtral 8x7B and GPT-3.5. They support a default 4K context length, expandable to 128K through LongRoPE. The models are trained on web data and synthetic data using a two-phase approach that builds both general knowledge and specialized skills (e.g. logical reasoning), and are fine-tuned for specific domains. Mini (3.8B) is especially optimized for mobile usage, requiring about 1.8GB of memory when compressed to 4 bits and processing 12+ tokens per second on devices such as the iPhone 14.
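A quick back-of-the-envelope check of the quoted footprint for mini, ignoring quantization overhead such as scale factors and the KV cache:

```python
params = 3.8e9            # Phi-3-mini parameter count
bits_per_weight = 4       # 4-bit compression, as quoted
gib = params * bits_per_weight / 8 / 1024**3
print(f"{gib:.2f} GiB")   # ~1.77 GiB, consistent with the ~1.8GB figure
```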

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time [Link] [Link]

2024 Generative AI Prediction Report from CB insights [Link]

Stable Diffusion 3 [Link]

Stable Diffusion 3 API now available as Stable Assistant effort looms [Link]

Stable Diffusion 3: Research Paper [Link]

Substack

You don’t get paid for working hard.

You get paid based on how hard you are to replace.

You get paid based on how much value you deliver.

Focus on being able to produce value and money will follow.

― Andrew Lokenauth

What he's saying is so true: don't work so hard that you end up losing yourself.

There is a popular saying on Wall Street. While IPO means Initial Public Offering, it also means “It’s Probably Overpriced” (coined by Ken Fisher).

I don’t invest in brand-new IPOs during the first six months. Why? Shares tend to underperform out of the gate for new public companies and often bottom around the tail end of the lock-up period, with anticipation of selling pressure from insiders. It’s also critical to gain insights from the first few quarters to form an opinion about the management team.

  • Do they forecast conservatively?
  • Do they consistently beat their guidance?

If not, it might be a sign that they are running out of steam and may have embellished their prospects in the S-1. But we need several quarters to understand the dynamic at play.

― Rubrik IPO: Key Takeaways - App Economy Insights [Link]

An analysis of Rubrik, a Microsoft-backed cybersecurity company going public. I picked up some of the author's views on the company's performance and its strategic investments.

Intel Unleashes Enterprise AI with Gaudi 3 - AI Supremacy [Link]

Intel is a huge beneficiary of Biden's CHIPS Act. As announced in late March 2024, Intel will receive up to \(\$8.5\) billion in grants and \(\$11\) billion in loans from the US government to produce cutting-edge semiconductors.

US Banks: Uncertain Year - App Economy Insights [Link]

Formula 1’s recent surge in popularity and revenue isn’t simply a product of fast cars and daring drivers. The Netflix docuseries Drive to Survive, which premiered in March 2019 and is already in its sixth season, has played a transformative role in igniting global interest and fueling unprecedented growth for the sport.

The docuseries effectively humanized the sport, attracting new fans drawn to the high-stakes competition, team rivalries, and compelling personal narratives.

― Formula 1 Economics - App Economy Insights [Link]

Netflix Engagement Machine - App Economy Insights [Link]

Recent business highlights: 1) focus on drama and storylines around sports, 2) since Nov 2021, subscribers can download exclusive games on the App Store for free, and Netflix is exploring game monetization through in-app purchases or ads, 3) for Premium Subscription Video on Demand, churn plummeted YoY, 4) the \(\$6.99\)/month ad-supported plan launched in Nov 2022; memberships grew 65% QoQ but monetization is still lagging, 5) started limiting password sharing within one household.

Boeing vs Airbus - App Economy Insights [Link]

Boeing 737 MAX’s two fatal crashes due to faulty software have eroded public trust. In addition to quality issues, Boeing is facing severe production delays. Airbus on the other hand has captured significant market share from Boeing. Airbus is heavily investing in technologies such as hydrogen-powered aircraft and sustainable aviation fuels. Airbus is also investing in the A321XLR and potential new widebody aircraft.

We disagree on what open-source AI should mean - Interconnects [Link]

This is a general trend we have observed a couple of years ago. We called it Mosaic's Law, where a model of a certain capability will require 1/4th the dollars every year from hw/sw/algo advances. This means something that is \(\$100\)m today -> \(\$25\)m next year -> \(\$6\)m in 2 yrs -> \(\$1.5\)m in 3 yrs. ― Naveen Rao on X [Link]
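The quoted trajectory is just a divide-by-four recurrence; a few lines confirm the rounding:

```python
cost = 100e6                  # dollars to train a fixed-capability model today
for year in range(4):
    print(f"year {year}: ${cost / 1e6:.2f}m")
    cost /= 4                 # Mosaic's Law: 1/4th the dollars every year
# year 0: $100.00m, year 1: $25.00m, year 2: $6.25m, year 3: $1.56m
```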

DBRX: The new best open model and Databricks’ ML strategy - Interconnects [Link]

In the refusal tests, the inference system appears to include an added filtering step in the loop that refuses illegal requests.

Llama 3: Scaling open LLMs to AGI - Interconnects [Link]

Tesla: Robotaxi Pivot - App Economy Insights [Link]

Q1 FY24 was bad, probably the worst quarter, which suggests Tesla will get better over the rest of the year. It sounds like Elon is clearer about and more focused on his plan, and his promises are being met, though with some delays [Master Plan, Part Deux].

Recent business highlights: 1) cancelling Model 2 and focusing on Robotaxis and the next-gen platform (Redwood), 2) laying off 10%+ of staff, 3) FSD price cuts and EV (Model 3 and Model Y) price cuts, 4) recalled the Cybertruck due to safety issues, 5) the North American Charging Standard (NACS) is increasingly adopted by major automakers, 6) reached ~1.2B miles driven on FSD beta, 7) energy storage deployments increased sequentially.

What is competitive in the market: 1) pressure from BYD, 2) the OpenAI-backed Figure 01 robot and Boston Dynamics' next-gen Atlas are competing with Optimus.

With its open-source AI model Llama, Meta learned that the company doesn’t have to have the best models — but they need a lot of them. The content creation potential benefits Meta’s platforms, even if the models aren’t exclusively theirs.

Like Google with Android, Meta aims to build a platform to avoid being at the mercy of Apple or Google’s ecosystems. It’s a defensive strategy to protect their advertising business. The shift to a new vision-based computing experience is an opportunity to do so.

Meta has a head start in the VR developer community compared to Apple. A more open app model could solidify this advantage.

By now, Meta has a clear playbook for new products:

  1. Release an early version to a limited audience.
  2. Gather feedback and start improving it.
  3. Make it available to more people.
  4. Scale and refine.
  5. Monetize.

He also shared some interesting nuggets:

  • Roughly 30% of Facebook posts are AI-recommended.
  • Over 50% of Instagram content is AI-recommended.

― Meta: The Anti-Apple - App Economy Insights [Link]

Recent business highlights: 1) announced an open model for Horizon OS, which powers its VR headsets, 2) Meta AI is now powered by Llama 3, 3) whether or not TikTok survives in the US matters little: since the algorithm will not be sold, either outcome benefits competitors such as Meta.

Books

Humans do inspiration; machines do validation.

Math is good at optimizing a known system; humans are good at finding a new one. Put another way, change favors local maxima; innovation favors global disruption.

― “Lean Analytics, Use Data to Build a Better Startup Faster”

Sometimes people doing hard data work may forget to step back and look at the big picture. This is a common mistake: we can certainly go from scientific data analysis to actionable insights for making better business decisions, but we need to ask whether a decision is a global optimum or merely a local one, due to a limited sample, a restricted project goal, or a restricted team scope.

Articles

When I decided to end my voluntary immersion in the driver community, I could not shake the feeling that the depersonalization of app workers is a feature, not a bug, of an economic model born of and emboldened by transformations that are underway across the global economy. This includes increasingly prevalent work arrangements characterized by weak employer-worker relations (independent contracting), strong reliance on technology (algorithmic management, platform-mediated communication), and social isolation (no coworkers and limited customer interactions).

As forces continue to erode traditional forms of identity support, meaningful self-definition at work will increasingly rely on how we collectively use and misuse innovative technologies and business models.
For example, how can companies deploy algorithmic management in a way that doesn't threaten and depersonalize workers? How can focusing on the narratives that underlie and animate identities help workers reimagine what they really want and deserve out of a career coming out of the pandemic and the Great Resignation? Will increasingly immersive and realistic digital environments like the metaverse function as identity playgrounds for workers in the future? How will Web3 broadly, and the emergence of novel forms of organizing specifically (e.g., decentralized autonomous organizations or DAOs), affect the careers, connections, and causes that are so important to workers? What role can social media platforms, online discussion forums, and other types of virtual water coolers play in helping independent workers craft and sustain a desirable work identity? In short, how can we retain the human element in the face of increasingly shrewd resource management tactics?

― “Dehumanization is a Feature of Gig Work, Not a Bug”, Harvard Business Review, The Year in Tech 2024 [Article]

This reminds me of last year, when I was on vacation in LA and talked to a driver who worked for both Lyft and Uber. He complained that Lyft's route recommendation algorithm is shitty, not helpful at all, a waste of time, while Uber's is better in comparison. At that time I realized how important it is to strengthen employer-worker relations and to gather feedback from workers and clients for product improvement. This is a great article in which the author raises concerns about the dehumanization of workers in the future. While technology advances and the economy transforms, we shouldn't expect people to forget who they are in their daily work.

Bringing a new technology to market presents a chicken-or-egg problem: The product needs a supply of complementary offerings, but the suppliers and complementors don’t exist yet, and no entrepreneur wants to throw their lot in with a technology that isn’t on the market yet.

There are two ways of “solving” this problem. First, you can time the market, and wait until the ecosystem matures— though you risk waiting a long time. Second, you can drive the market, or supply all the necessary inputs and complements yourself.

― “Does Elon Musk Have a Strategy?”, Harvard Business Review, The Year in Tech 2024 [Article]

Here are two examples of driving the market: Musk supplies both electric vehicles and charging stations, and Zuckerberg proposed the concept of the metaverse and changed his company's name accordingly.

This is where Musk’s Wall Street critics might say he’s weakest. Many of his businesses don’t articulate a clear logic, which is demonstrated by the unpredictable way these businesses ultimately reach solutions or products.

Musk has spelled out some of his prior logic in a set of “Master Plans,” but most of the logical basis for exactly how he will succeed remains ambiguous. But this isn’t necessarily Musk’s fault or due to any negligence per se: When pursuing new technologies, particularly ones that open up a new market, there is no one who can anticipate the full set of possibilities of what that technology will be able to do (and what it will not be able to do).

― “Does Elon Musk Have a Strategy?”, Harvard Business Review, The Year in Tech 2024 [Article]

Elon is interesting, but I have to say that we humans need this type of person to leap into the future.

What could he possibly want with Twitter? The thing is, over the last decade, the technological landscape has changed, and how and when to moderate speech has become a critical problem-and an existential problem for social media companies. In other words, moderating speech has looked more and more like the kind of big, complex strategic problem that captures Musk’s interest.

― “Does Elon Musk Have a Strategy?”, Harvard Business Review, The Year in Tech 2024 [Article]

This is a great article which profiles Elon Musk, specifically his strategies and vision. It answered a question that had confused me for two years: what was Musk thinking when he bought Twitter? The answer: Musk's vision is not the pursuit of a specific type of solution but the pursuit of a specific type of problem. If we go back to 2016, Bob Iger, as CEO of Disney, decided not to buy Twitter because he looked at Twitter as a solution, a global distribution platform, while worrying that the quality of speech on it was a problem. Musk was looking for challenges and complexities, while Iger was avoiding them and looking for solutions.

YouTube

“One of the things that I think OpenAI is doing that is the most important of everything that we are doing is putting powerful technology in the hands of people for free as a public good. We don’t run ads on our free version. We don’t monetize it in other ways. I think that kind of ‘open’ is very important, and is a huge deal for how we fulfill the mission.” ― Sam

“For active learning, the thing is it truly needs a problem. It needs a problem that requires it. It is very hard to do research about the capability of active learning if you don’t have a task. You will come up with an artificial task, get good results, but not really convince anyone.” “Active learning will actually arrive with the problem that requires it to pop up.” ― Ilya

“To build an AGI, I think it’s going to be Deep Learning plus some ideas, and self-play would be one of these ideas. Self-play has such properties that can surprise us in truly novel ways. Almost all self-play systems produce surprising behaviors that we didn’t expect. They are creating solutions to problems.”, “Not just random surprise but to find the surprising solution to a problem.”― Ilya

“Transferring from simulation to the real world is definitely possible and it’s been exhibited many times by many different groups. It’s been especially successful in vision. Also OpenAI in the summer demonstrated a robot hand which was trained entirely in simulation.” “The policy that it learned in simulation was trained to be very adaptive. So adaptive that when you transfer, it could very quickly adapt to the physical world.” ― Ilya

“The real world that I would imagine is one where humanity are like the board members of a company where the AGI is the CEO. The picture I would imagine is you have some kind of different entities, countries or cities, and the people who live there vote for what the AGI that represents them should do. You could have multiple AGIs, you would have an AGI for a city, for a country, and it would be trying to in effect take the democratic process to the next level.” “(And the board can always fire the CEO), press the reset button, re-randomize the parameters.” ― Ilya

“It’s definitely possible to build AI systems which will want to be controlled by their humans.” “It will be possible to program an AGI to design it in such a way that it will have a similar deep drive that it will be delighted to fulfill, and the drive will be to help humans flourish.” ― Ilya

“I don’t know if most people are good. I think that when it really counts, people can be better than we think.”― Ilya

Sam Altman: OpenAI, GPT-5, Sora, Board Saga, Elon Musk, Ilya, Power & AGI | Lex Fridman Podcast [Sam]

Ilya Sutskever: Deep Learning | Lex Fridman Podcast [Ilya]

I watched Lex's interview with Sam Altman, uploaded on March 18, 2024, and an older interview with Ilya Sutskever from three years ago. Elon's lawsuit against OpenAI frustrated Sam, but Sam is optimistic about the future and everything he is going to release in the next few months. Sam answered the question "what does 'open' mean in OpenAI": 'open' mainly means putting powerful tech in the hands of people for free as a public good, not necessarily open-source; there can be both open-source and closed-source models. About the transition from non-profit to capped for-profit, Sam said OpenAI is not setting a precedent for startups to mimic; he suggested most startups should simply go for-profit if they pursue profitability from the beginning.

Ilya's interview is more interesting to me because he talked a lot about the vision, technology, and philosophy of Deep Learning. It's impressive that he had such thoughts three years ago.

News

Speech is one kind of liability for companies using generative AI. The design of these systems can create other kinds of harms—by, say, introducing bias in hiring, giving bad advice, or simply making up information that might lead to financial damages for a person who trusts these systems.

Because AIs can be used in so many ways, in so many industries, it may take time to understand what their harms are in a variety of contexts, and how best to regulate them, says Schultz.

― The AI Industry Is Steaming Toward A Legal Iceberg [Link]

Section 230 of the Communications Decency Act of 1996 has protected internet platforms from being held liable for the things we say on them, but it doesn't cover speech that a company's AI generates. It's likely that in the future, companies that use AI will be liable for whatever it does. That could push companies to work to avoid problematic AI output and to reduce "hallucinations" (when GenAI makes things up).

An obvious downside is that people could use AI to do harmful things, but I think people are good at working together to prevent that from different angles, such as the legal system or open-source software. I worry more about harms that are latent or invisible, at least in these years while AI is still immature: relying on AI too early and deviating from the truth. For example, it is too soon for people to lose faith in jobs like teachers, historians, journalists, and writers, but I'm concerned people are already losing faith in those jobs because of the development of AI and some violations of copyrighted work. As we have seen, AI can misunderstand facts, hold biased opinions, and make up things that don't exist; the living can fight for the truth, but the dead cannot talk.

China’s latest EV is a ‘connected’ car from smart phone and electronics maker Xiaomi [Link]

Xiaomi entered EV manufacturing in 2021 and launched its first EV, the SU7, on March 28th, 2024. Its prospects rest on the following: 1) efficient, large-scale technology manufacturing; though Xiaomi has no experience in the auto field, it is a supply chain master with strong partnerships across suppliers, 2) an affordable price; the SU7 starts at 215900 yuan while Tesla's Model 3 starts at 245900 yuan, 3) customer-experience-oriented innovation; the SU7 can link to over 1000 Xiaomi devices as well as Apple devices, and Xiaomi aims to connect its cars with its phones and home appliances in a "Human x Car x Home" ecosystem.

“The world is just now realizing how important high-speed inference is to generative AI,” said Madra. “At Groq, we’re giving developers the speed, low latency, and efficiency they need to deliver on the generative AI promise. I have been a big fan of Groq since I first met Jonathan in 2016 and I am thrilled to join him and the Groq team in their quest to bring the fastest inference engine to the world.”

“Separating GroqCloud and Groq Systems into two business units will enable Groq to continue to innovate at a rapid clip, accelerate inference, and lead the AI chip race, while the legacy providers and other big names in AI are still trying to build a chip that can compete with our LPU,” added Ross.

― Groq® Acquires Definitive Intelligence to Launch GroqCloud [Link]

AI chip startup Groq acquired Definitive Intelligence to launch a GroqCloud business unit led by Definitive Intelligence's CEO Sunny Madra. Groq is also forming a Groq Systems business unit, staffed with engineering resources from Definitive Intelligence, aiming to greatly expand its customer and developer ecosystem.

Groq's founder Jonathan Ross is the inventor of the Google Tensor Processing Unit (TPU), Google's custom AI accelerator chip used to run models. Groq is building a Language Processing Unit (LPU) inference engine, which it claims can run LLMs at 10x speed. GroqCloud now provides customers with the Groq LPU inference engine via a self-serve playground.

The House voted to advance a bill that could get TikTok banned in the U.S. on Wednesday. In a 352-65 vote, representatives passed the bipartisan bill that would force ByteDance, the parent company of TikTok, to either sell the video-sharing platform or prohibit it from becoming available in the U.S.

― What to Know About the Bill That Could Get TikTok Banned in the U.S. [Link]

TikTok is considered a critical threat to US national security because it is owned by ByteDance, which can be required to collaborate with the Chinese Communist Party (CCP). If the bill passes, ByteDance has to either sell the platform within 180 days or face a ban. TikTok informed users that Congress is planning a total ban and encouraged them to speak out against it. Shou Zi Chew said the ban would put more than 300000 American jobs at risk.

San Francisco-based Anthropic introduced three new AI models — Claude 3 Opus, Sonnet and Haiku. The literary names hint at the capabilities of each model, with Opus being the most powerful and Haiku the lightest and quickest. Opus and Sonnet are available to developers now, while Haiku will arrive in the coming weeks, the company said on Monday.

― AI Startup Anthropic Launches New Models for Chatbot Claude [Link]

Waymo’s progress in California comes after General Motors-owned Cruise and Apple bowed out of the autonomous vehicle business in California, while Elon Musk’s Tesla has yet to develop an autonomous vehicle that can safely operate without a human driver at the controls.

― Waymo approved by regulator to expand robotaxi service in Los Angeles, San Francisco Peninsula [Link]

Elon Musk requires “FSD” demo for every prospective Tesla buyer in North America [Link]

The Full Self-Driving era seems to be starting, but Tesla's FSD system does not make its cars autonomous vehicles, so drivers still need to watch the road and be ready to steer or brake at any time while using FSD or FSD Beta. Will FSD help Tesla's stock?

SpaceX Starship disintegrates after completing most of third test flight [Link]

SpaceX's Starship successfully repeated stage separation during the initial ascent, opened and closed its payload door in orbit, and transferred super-cooled rocket propellant from one tank to another during spaceflight. But it skipped the Raptor engine re-ignition test, failed re-entry into the atmosphere, and did not fly the rocket back to Earth. Overall, completing many of the objectives represents progress in developing the spacecraft for SpaceX's business and NASA's moon program.

Open Release of Grok-1 [Link]

Musk founded xAI in March 2023, aiming to "understand the true nature of the universe". It released the weights and network architecture of the 314B-parameter Grok-1 on March 17, 2024, under the Apache 2.0 license, which allows commercial use. The model can be found on GitHub.

GB200 has a somewhat more modest seven times the performance of an H100, and Nvidia says it offers four times the training speed.

Nvidia is counting on companies to buy large quantities of these GPUs, of course, and is packaging them in larger designs, like the GB200 NVL72, which plugs 36 CPUs and 72 GPUs into a single liquid-cooled rack for a total of 720 petaflops of AI training performance or 1,440 petaflops (aka 1.4 exaflops) of inference. It has nearly two miles of cables inside, with 5,000 individual cables.

And of course, Nvidia is happy to offer companies the rest of the solution, too. Here’s the DGX Superpod for DGX GB200, which combines eight systems in one for a total of 288 CPUs, 576 GPUs, 240TB of memory, and 11.5 exaflops of FP4 computing.

Nvidia says its systems can scale to tens of thousands of the GB200 superchips, connected together with 800Gbps networking with its new Quantum-X800 InfiniBand (for up to 144 connections) or Spectrum-X800 ethernet (for up to 64 connections).

― Nvidia reveals Blackwell B200 GPU, the ‘world’s most powerful chip’ for AI [Link] [keynote]

Two B200 GPUs combined with one Grace CPU make a GB200 Blackwell Superchip. Two GB200 Superchips make one Blackwell compute node. 18 Blackwell compute nodes contain 36 CPUs + 72 GPUs, forming one larger virtual GPU: the GB200 NVL72.

Nvidia also offers larger packages, such as the DGX SuperPOD for DGX GB200, which combines 8 such GB200 NVL72 systems. 8 GB200 NVL72 combined with xx becomes one GB200 NVL72 compute rack. An AI factory, or full data center of the future, would consist of about 56 such compute racks, around 32000 GPUs in total.
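Taking the figures above at face value (72 GPUs per NVL72 and 8 NVL72 per rack-scale unit), the GPU count roughly checks out:

```python
gpus_per_nvl72 = 72            # 18 compute nodes x 4 GPUs each, per the quote
nvl72_per_unit = 8             # the SuperPOD-style grouping described above
racks = 56                     # the "AI factory" figure in the text
total = racks * nvl72_per_unit * gpus_per_nvl72
print(total)                   # 32256, about the "around 32000 GPUs" quoted
```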

The Blackwell Superchip will be 4 times faster than the H100 and 25 times more energy efficient.

OpenAI is expected to release a ‘materially better’ GPT-5 for its chatbot mid-year, sources say [Link]

On March 14 (local time), during a meeting with the Korean Silicon Valley correspondent group, CEO Altman mentioned, “I am not sure when GPT-5 will be released, but it will make significant progress as a model taking a leap forward in advanced reasoning capabilities. There are many questions about whether there are any limits to GPT, but I can confidently say ‘no’.” He expressed confidence that if sufficient computing resources are invested, building AGI that surpasses human capabilities is entirely feasible.

Other news:

Elon Musk sues OpenAI for abandoning its mission to benefit humanity [Link]

A major AT&T data leak posted to the dark web included passcodes, Social Security numbers [Link]

Apple accused of monopolizing smartphone markets in US antitrust lawsuit [Link]

Amazon Invests $2.75 Billion in AI Startup Anthropic [Link]

Adam Neumann looks to buy back WeWork for more than $500M: sources [Link]

NVIDIA Powers Japan’s ABCI-Q Supercomputer for Quantum Research [Link]

Lilac Joins Databricks to Simplify Unstructured Data Evaluation for Generative AI [Link]

Papers and Reports

Scaling Instructable Agents Across Many Simulated Worlds [Link]

The Google DeepMind SIMA team is working on the Scalable, Instructable, Multiworld Agent (SIMA) project. The goal is to develop an agent that follows instructions to complete tasks in any 3D environment. So far they have made progress on getting the agent to understand the environment from the computer screen, use keyboard-and-mouse controls to interact with it, follow language instructions, and play video games to maximize the win rate.

OpenAI has similar work called OpenAI Universe, which aims to train and validate AI agents on performing real-world tasks; it also started from video game environments. Although the goals of the two projects sound similar, the subtle difference is that OpenAI Universe intended to build a platform where AI can interact with games, websites, and applications, while SIMA aims to build an AI agent, or perhaps a robot, that interacts with the real world.

Announcing HyperGAI: a New Chapter in Multimodal Gen AI [Link]

Introducing HPT: A Groundbreaking Family of Leading Multimodal LLMs [Link]

The startup HyperGAI aims to develop models for multimodal understanding and multimodal generation. It released HPT Air and HPT Pro; HPT Pro outperforms GPT-4V and Gemini Pro on the MMBench and SEED-Image benchmarks.

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework [Link]

Sora is the leading video generation model, but it is not open-source. Lehigh University and Microsoft Research developed Mora to fill the gap left by the absence of other video generation models that parallel Sora's performance. Mora introduces an innovative multi-agent framework and marks a considerable advancement in text-to-video generation. The evaluation shows that Mora competes with Sora on most tasks, but is not as refined in tasks such as modifying video content and maintaining video connectivity.

Substack

Streaming Giants Earnings - App Economy Insights [Link]

CrowdStrike has repeated in its investor presentations how it wants to be the leading ‘Security Cloud’ and emulate other category-defining cloud platforms:

  • Workday (HR Cloud).
  • Salesforce (CRM Cloud).
  • ServiceNow (Service Management Cloud).

Public cloud software companies are overwhelmingly unprofitable businesses. However, in FY24, Salesforce (CRM) demonstrated that margins can expand quickly once the focus turns to the bottom line (see visual). And when the entire business is driven by recurring subscription revenue and highly predictable unit economics, you are looking at a finely-tuned cash flow machine.

― CrowdStrike: AI-Powered Security - App Economy Insights [Link]

GRANOLAS: Europe’s Darlings - App Economy Insights [Link]

Oracle became TikTok’s cloud provider in 2020 for US users. With the risk of a TikTok ban in America, we’ll look at the potential revenue impact.

Catz believes this growth (of OCI revenue) is driven by:

  • Price Performance.
  • Full-stack technology for mission-critical workloads.
  • AI capabilities focused on business outcomes.
  • Deployment flexibility.
  • Multi-cloud offerings.

― Oracle: Cloud & AI Focus - App Economy Insights [Link]

Oracle services for enterprise software and cloud solutions: 1) cloud suite (cloud applications and services), 2) data analytics, 3) autonomous database, 4) enterprise resource planning (ERP) to improve operational efficiencies and integrated solutions to streamline complex business functions.

Key news highlights: 1) Oracle acquired Cerner in June 2022, a leading provider of electronic health records (EHR) and other healthcare IT solutions used by hospitals and health systems; Oracle is expanding its cloud services, including the upcoming launch of the Ambulatory Clinic Cloud Application Suite for Cerner customers. 2) Adoption of Oracle Cloud Infrastructure (OCI) spans different segments: cloud-native customers such as Zoom, Uber, and ByteDance looking for high price performance and integrated security and privacy; AI/ML customers looking for differentiation, compute performance, and networking design; and generative AI customers looking for control, data security, privacy, and governance. 3) TikTok is probably an essential component of the growth of OCI Gen2 infrastructure cloud services. 4) Oracle signed a big Generation 2 Cloud infrastructure contract with Nvidia. 5) Oracle is a key player in sovereign AI; it's starting to win sovereign cloud business country by country, especially with cloud companies in Japan.

NVIDIA ‘AI Woodstock’ - App Economy Insights [Link]

“We spent $700,000 on [our five-second Super Bowl ad] in total and yet earned over 60 million social media impressions.” [link]

― Duolingo: Gamified Learning - App Economy Insights [Link]

Duolingo launched the Duolingo Max subscription tier ($168/year), with Gen AI features enabling a more conversational and listening approach. Duolingo has leveraged AI in two areas: 1) using AI to create content, which allows it to experiment faster, 2) using AI to power spoken conversation with characters.

What's coming: Duolingo launched Math and Music courses in its app in 2023.

Articles

OpenAI and Elon Musk [Link]

Read the arguments between OpenAI and Elon. I learned that Elon once believed OpenAI had a 0% probability of success, and wanted OpenAI to become for-profit so it could be merged into Tesla and controlled by Elon himself.
