I finished reading Peter Thiel's 'Zero to One: Notes on Startups,
or How to Build the Future' today. With the rapid advancements and
widespread discussion around AI, the core arguments about technology,
human-machine collaboration, and the nature of progress hold up
remarkably well. And in some ways, as a manifesto for building a better
future, what's written in this book is even more relevant now.
Chapter 2 Party Like It's 1999 outlines four lessons learned from the dot-com crash that became 'dogma' in the startup world. Thiel argues that these dogmas are largely incorrect and that the opposite principles are probably more correct:
Make incremental advances: Grand visions were
seen as bubble-inflating, so small, incremental steps became the
preferred path.
Thiel: It is better to risk boldness than triviality.
Stay lean and flexible: Planning was deemed
arrogant, and "agnostic experimentation" became the norm.
Thiel: A bad plan is better than no plan.
Improve on the competition: Focus on existing
customers and recognizable products, improving on what competitors
already offer.
Thiel: Competitive markets destroy profits.
Focus on product, not sales: If a product
requires advertising or salespeople, it's not good enough; viral growth
is the only sustainable growth.
Thiel: Sales matters just as much as product.
"The most contrarian thing of all is not to oppose the crowd, but to
think for yourself."
At the end of this chapter, Thiel challenges the reader not to
simply adopt the prevailing "lessons learned" from the past, but to
critically evaluate them. He suggests that true contrarianism isn't just
about disagreeing with the majority for the sake of it, but about
independent thought and forming your own conclusions, even if those
conclusions align with or contradict the crowd. It's about genuine
intellectual autonomy.
In Chapter 3 All Happy Companies Are Different, he
argues that successful companies are unique and that true value comes
from creating a monopoly rather than competing in existing markets.
Thiel uses the economic models of "perfect competition" and "monopoly"
to explain this difference. In perfect competition, firms sell identical
products, have no market power, and thus, in the long run, make no
economic profit as new entrants drive prices down. A monopoly,
conversely, owns its market, allowing it to set prices and maximize
profits due to a lack of close substitutes. He asserts that competition
is destructive, leading to a ruthless struggle for survival and zero
profits. Monopolies, on the other hand, can afford to focus on long-term
innovation, employee well-being, and broader societal impact because
they are not constantly battling for survival. Creative monopolies are
powerful engines for progress as they introduce entirely new categories
of abundance to the world.
He then discusses how both monopolists and non-monopolists tend to
misrepresent their market conditions. Monopolists (like Google) downplay
their dominance by broadly defining their market to avoid scrutiny,
while non-monopolists (like a new restaurant owner) narrowly define
their market to appear unique and avoid acknowledging intense
competition. Thiel emphasizes that losing sight of competitive reality
by focusing on trivial differentiators is a fatal mistake for
startups.
"All happy companies are different: each one earns a monopoly by
solving a unique problem. All failed companies are the same: they failed
to escape competition."
This is the core message from the book. Entrepreneurs should strive
to build unique, monopolistic businesses by creating something entirely
new.
Chapter 3 primarily focuses on the economic and strategic advantages
of monopoly and the destructive nature of perfect competition.
Chapter 4: "The Ideology of Competition" shifts the
focus to the societal and psychological impact of competition.
Competition is not merely an economic concept but a deeply ingrained
"ideology" that pervades our society, from education to personal
aspirations. He reminds readers that competition can blind people to real opportunities and lead to irrational behavior and missed chances, and he urges us to recognize and resist this pervasive ideology.
Chapter 5 Last Mover Advantage discusses how a great
business is defined by its ability to generate future cash flows and
argues that being a last mover (i.e., to make the last great development
in a market and enjoy long-term monopoly profits) is more advantageous
than being a first mover. It outlines four characteristics of monopoly
that contribute to a company's durability:
Proprietary Technology: This makes a product
difficult to replicate, ideally being at least 10 times better than its
closest substitute (e.g., Google's search algorithms, PayPal's payment
system for eBay, Amazon's book selection, Apple's integrated
design).
Network Effects: The product becomes more
valuable as more people use it (e.g., Facebook). Thiel emphasizes that
such businesses must start with a very small, focused market to get
initial users.
Economies of Scale: Fixed costs can be spread
over increasing sales, making the business stronger as it grows.
Software companies are particularly suited for this due to near-zero
marginal costs.
Branding: A strong brand creates a monopoly
(e.g., Apple). However, branding needs to be built on substantive
advantages, not just surface-level polish.
"You've probably heard about 'first mover advantage': if you're the
first entrant into a market, you can capture significant market share
while competitors scramble to get started. But moving first is a tactic,
not a goal. What really matters is generating cash flows in the future,
so being the first mover doesn't do you any good if someone else comes
along and unseats you. It's much better to be the last mover— that is,
to make the last great development in a specific market and enjoy years
or even decades of monopoly profits. The way to do that is to dominate a
small niche and scale up from there, toward your ambitious long-term
vision. In this one particular at least, business is like chess.
Grandmaster José Raúl Capablanca put it well: to succeed, 'you must
study the endgame before everything else.'"
Thiel's advice for startups, (1) start small and monopolize, (2) scale up gradually, and (3) don't disrupt, reminds me of Google's AI strategy over the past two years, which seems to align with the 'last mover
advantage' mentality. Instead of trying to release one massive,
all-encompassing AI that competes directly with established players
across every front, Google has released or integrated AI into many
"smaller" applications or features (workspace, photos, maps, gemini,
etc). Each of these can be seen as a "small market" or specific use case
where AI offers a distinct advantage, allowing Google to "monopolize"
that particular user experience. After establishing AI capabilities in
focused areas, they are integrating these more broadly. Successful AI
features in Workspace might then be leveraged for enterprise solutions.
Advancements in image recognition from Photos could be applied to
broader visual search or other AI models. The iterative development of
Bard/Gemini, starting as a conversational AI and gradually expanding its
capabilities (multimodality, coding, planning), is a clear example of
scaling up. They build upon established user bases and technological
strengths. While Google is certainly competing, their strategy doesn't
always seem to be about a direct, disruptive frontal assault that
immediately aims to destroy an incumbent. Instead, it's often about: (1) leveraging their existing ecosystem, (2) focusing on unique capabilities, and (3) creating new user behaviors.
In Chapter 6 You Are Not a Lottery Ticket, Thiel describes the concept of definite vs. indefinite futures and asserts
that the prevailing indefinite optimism, particularly in the US, is
unsustainable. He argues that real progress and success require definite
plans and individual effort.
“When Baby Boomers grow up and write books to explain why one or
another individual is successful, they point to the power of a
particular individual's context as determined by chance. But they miss
the even bigger social context for their own preferred explanations: a
whole generation learned from childhood to overrate the power of chance
and underrate the importance of planning."
The core of Chapter 7 Follow the Money applies the
power law to venture capital (VC). Venture returns are not normally
distributed (where most companies perform average). Instead, they follow
a power law: a small handful of companies radically outperform all
others, often returning more than the entire rest of the fund combined.
People often fail to see the power law, which is a fundamental law of
the universe, because it only becomes clear over time; early-stage
companies in a portfolio might look similar before exponential growth
kicks in. Despite being a niche (less than 1% of new businesses receive
VC funding), venture-backed companies disproportionately drive the
economy, creating 11% of private sector jobs and generating 21% of GDP.
The largest tech companies, all venture-backed, are worth more than all
other tech companies combined.
Understanding the power law means focusing on the singular, most
important things (e.g., one best market, one dominant distribution
strategy). To achieve disproportionate success, one must identify and
focus relentlessly on those few critical elements.
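To see how extreme a power-law portfolio can be, here is a small simulation (my own illustrative sketch in Python; the Pareto parameter and fund size are assumptions, not figures from the book):

```python
import numpy as np

# Draw startup outcomes from a heavy-tailed Pareto distribution and check
# how often the single best company in a 20-company fund returns more
# than the other 19 combined. All parameters are illustrative.
rng = np.random.default_rng(42)
outcomes = rng.pareto(a=1.1, size=(10_000, 20))  # heavy tail, alpha near 1

best = outcomes.max(axis=1)
rest = outcomes.sum(axis=1) - best
print(f"best single company outearns the rest of the fund in "
      f"{(best > rest).mean():.0%} of simulated funds")
```

With a tail exponent this close to 1, the top company often outweighs the rest of the portfolio combined, which is exactly the pattern Thiel describes.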
In Chapter 8 Secrets, Thiel begins by posing his
contrarian question ("What important truth do very few people agree
with you on?") in the context of secrets. He states that a good
answer to this question implies the existence of secrets – something
important, unknown, difficult, but achievable. He argues that secrets
still exist and are crucial for progress. Secrets can lead to monumental
advancements in science, medicine, and technology (e.g., curing
diseases, new energy sources). In business, secrets can lead to valuable
companies built on overlooked opportunities, like Airbnb (untapped
supply and unaddressed demand in lodging) and Uber/Lyft (connecting
drivers and riders). On how to find secrets, Thiel discusses (1) the distinction between secrets of nature, found by studying the physical world, and secrets about people, found by understanding human nature, and (2) looking at fields that matter but haven't been standardized.
Chapter 9 Foundations centers on 'Thiel's law': a startup messed up at its foundation cannot be fixed. It offers guidance at the foundational level: co-founder relationships; ownership, possession, and control; small boards; full-time commitment; equity as king; the founding moment; etc. Chapter 10 The Mechanics of Mafia
highlights the importance of company culture. Chapter 11 If You
Build It Will They Come stresses that distribution (sales,
marketing, advertising) is often underestimated and is just as crucial
as product development.
"The founding moment of a company, however, really does happen just
once: only at the very start do you have the opportunity to set the
rules that will align people toward the creation of value in the
future.
The most valuable kind of company maintains an openness to invention
that is most characteristic of beginnings. This leads to a second, less
obvious understanding of the founding: it lasts as long as a company is
creating new things, and it ends when creation stops. If you get the
founding moment right, you can do more than create a valuable company:
you can steer its distant future toward the creation of new things
instead of the stewardship of inherited success. You might even extend
its founding indefinitely."
"'Company culture' doesn't exist apart from the company itself: no
company has a culture; every company is a culture. A startup is a team
of people on a mission, and a good culture is just what that looks like
on the inside."
There is a core debate right now around whether AI is going to
replace human jobs, and this book offers powerful arguments for the
"AI as complement, not replacement" side. Thiel explicitly argued
against the "substitution fallacy" in Chapter 12 (Man and
Machine), stating that computers and humans have different
strengths and will thrive through collaboration. Although generative AI is unprecedented and its impact on human society is nuanced, I agree there are fundamental differences in intelligence between humans and AI. Humans possess intentionality, true innovation, empathy, and emotional intelligence, and human judgment is still needed for ethical concerns and complex problems. AI as a tool is better suited to augmentation that increases productivity than to wholesale automation. Historically, technological development has created more jobs than it has destroyed. While some roles are eliminated, new roles emerge: AI engineers, prompt engineers, AI product managers, etc. In essence, it's a redefinition of work rather than an elimination of it.
"People compete for jobs and for resources; computers compete for
neither."
Globalization is about substitution. Technology is about
complementarity.
Chapter 13 Seeing Green analyzes the failure of the
cleantech bubble, attributing it to a widespread failure to answer the
seven critical questions every successful business must address.
The Engineering Question: Most offered only
incremental, not breakthrough (10x better), technology (e.g., Solyndra's
inefficient cylindrical solar cells).
The Timing Question: They misjudged market
readiness and the slow, linear progress of solar technology compared to
exponential tech.
The Monopoly Question: They pursued
"trillion-dollar markets" that were fiercely competitive, rather than
small, defensible niches.
The People Question: Teams were often led by
"salesman-executives" lacking technical expertise, focusing on
fundraising over product. (Thiel suggests a "never invest in a tech CEO
that wears a suit" rule.)
The Distribution Question: Companies often
overlooked effective distribution, leading to complex and inconvenient
sales models (e.g., Better Place's battery swapping).
The Durability Question: They failed to anticipate
competition (e.g., from China) or market shifts (e.g., the rise of
fracking).
The Secret Question: They based their ventures on
"conventional truths" (the need for a cleaner world), which everyone
agreed on, rather than unique, hidden insights.
"The 1990s had one big idea: the internet is going to be big. But too
many internet companies had exactly that same idea and no others. An
entrepreneur can't benefit from macroscale insight unless his own plans
begin at the micro-scale. Cleantech companies faced the same problem: no
matter how much the world needs energy, only a firm that offers a
superior solution for a specific energy problem can make money. No
sector will ever be so important that merely participating in it will be
enough to build a great company."
Chapter 14 The Founder's Paradox explores the often
extreme, contradictory, and seemingly peculiar traits of successful
founders, arguing that these unique characteristics are both powerful
for a company and carry inherent dangers for the founder. Society needs
founders – unusual individuals who can make authoritative decisions,
inspire loyalty, and plan long-term, moving companies beyond
incrementalism. However, founders must be wary of overestimating their
own power and succumbing to their own myth, mistaking public adulation
or criticism for truth. The greatest danger for a founder is losing
their mind; for a business, it's losing its myth and vision.
The current AI boom feels very much like an "accelerating takeoff" in terms of technological advancement, which is mentioned in the final chapter, "Conclusion: Stagnation or Singularity," as one of Nick Bostrom's four possible patterns for humanity's future.
Accelerating Takeoff (Singularity) is the most difficult scenario to
imagine: new technologies so powerful that they transcend current
understanding, leading to a much better future. Ray Kurzweil's
"Singularity is near" concept, based on exponential growth trends, is
mentioned as a prominent view of this outcome. However, since Thiel's book is a manifesto for building a better future and criticizes 'indefinite optimism', in the context of the AI boom the 'Singularity' is not a predetermined destination but the outcome of the choices we make today:
Are we using AI for "0 to 1" innovation to solve truly hard
problems and create new value, or are we just using it for "1 to n"
incremental improvements and fierce competition?
Are we making definite plans for how AI will integrate with and
enhance human capabilities, or are we succumbing to "indefinite fears"
or blind optimism?
Are we building companies around unique AI-driven insights that
can create sustainable monopolies, or are we simply entering crowded AI
markets hoping for a piece of an existing pie?
The frontier isn’t volume—it’s discernment. And in that shift,
taste has become a survival skill.
Because when abundance is infinite, attention is everything. And
what you give your attention to—what you consume, what you engage with,
what you amplify—becomes a reflection of how you think.
What matters now is what you do with it. How you filter it. How
you recognize signal in the noise. Curation is the new IQ test.
Taste is often dismissed as something shallow or subjective. But
at its core, it’s a form of literacy—a way of reading the world. Good
taste isn’t about being right. It’s about being attuned. To rhythm, to
proportion, to vibe. It’s knowing when something is off, even if you
can’t fully articulate why.
Taste is what allows you to skim past the performative noise, the
fake depth, the viral bait, and know—instinctively—what’s worth your
time.
And that’s what real taste is: a deep internal coherence. A way
of filtering the world through intuition that’s been sharpened by
attention.
When you sharpen your discernment, you stop being swayed by
trends. You stop needing consensus. You stop reacting to every new thing
like it’s urgent.
There will always be creators. But the ones who stand out in this
era are also curators. People who filter their worldview so cleanly that
you want to see through their eyes. People who make you feel sharper
just by paying attention to what they pay attention to.
1995 interview with Steve Jobs — “Ultimately, it comes
down to taste. It comes down to trying to expose yourself to the best
things that humans have done, and then try to bring those things into
what you’re doing.”
Good taste isn’t restrictive. It’s expansive. It allows you to
contain multitudes without becoming incongruent.
But good taste is deep structure. It’s the throughline in
someone’s life. You can see it in the design of their home, the cadence
of their speech, the way they treat people, the books on their
shelves. Taste is how you live a congruent life. Not in the
sense of brand consistency, but in the sense of spiritual alignment. You
can change your mind. Explore new spaces. But your values stay intact.
Your center holds.
― Taste Is the New Intelligence - Wild Bare Thoughts
[Link]
This is an amazing article. In an age of infinite content, taste is your compass. It's not about elitism; it's about aligning your attention with what truly matters to you.
Learnings and Suggestions:
Cultivate Discernment Over Consumption: Prioritize depth over
volume in what you read, watch, and engage with. Ask "Is this worth my
time?" before consuming content, creating something, or sharing. Trust
your intuition—if something feels off, skip it.
Curate Your Inputs (Because They Shape Your Outputs): Unfollow
accounts, mute topics, and unsubscribe from newsletters that don’t align
with your values. Follow thinkers, creators, and curators who
consistently offer depth. Set boundaries (e.g., no mindless scrolling
after 9 PM). Pause after reading/watching to digest, not just
react.
Build a "Library Mindset" (Not a Wishlist One): Read books,
essays, and long-form work that lingers. Don’t engage with viral content
just because it’s popular. Save/share only what resonates deeply—not
what’s merely entertaining.
Train Your Taste Like a Muscle: Study great art, writing, music,
and design to refine your sensibility. Remove distractions, unnecessary
commitments, and low-value inputs. Note what ideas/images/sounds stay
with you—these reveal your true taste.
Embrace Coherence Over Consistency: Your bookshelf, playlists,
and feeds should reflect who you are (or aspire to be). Stay open to new
influences, but filter them through your core principles. Don’t adopt
aesthetics/opinions for status—authenticity matters more.
Practice "Vibe Coding" (Like Rick Rubin): Whether in
conversations or creativity, prioritize feeling over formulas. In
work/life, strip away excess until only the essential remains. If
something feels "alive," lean in—even if it defies logic.
Reject Cheap Dopamine for Lasting Satisfaction: Opt for the book
over the tweet, the slow movie over the clip. After consuming something,
ask: Did this uplift or drain me? Regularly eliminate distractions
(apps, subscriptions, habits) that don’t serve you.
Taste as a Spiritual Practice: Prioritize art/ideas that
rearrange your perspective. From your home to your workspace, align
space with intention. Engage only with what nourishes, not
depletes.
Remember: Curation = Power: Amplify only what deserves a wider
audience. Your ability to filter signal from noise is a competitive
edge. The more you refine your taste, the more it protects you from
chaos.
A Primer on US Healthcare - Generative Value [Link]
This article covers an overview of the system (main players), the
value chain (how products and services flow through the system and what
profitable segments are), incentives (motivation of behaviors),
challenges (significant issues within the industry), and potential
solutions (software and AI).
Its key perspective appears to be the interplay between incentives, middlemen, and the resulting administrative burden, with AI as the specific technological solution.
BREAKING: UnitedHealth Bleeds. CEO Witty Steps Down. - Sergei
Polevikov, AI Health Uncut [Link]
UnitedHealth Abuse Tracker - Matt Stoller, American Economic
Liberties Project [Link]
Vibe coding is a new approach to software development that
utilizes AI tools to assist individuals in creating applications and
software without requiring extensive programming
knowledge.
The term was popularized by Andrej Karpathy, an AI expert, who
described it as a method where users interact with AI using
natural language to describe their ideas rather than writing
traditional code directly.
This allows creators, particularly those lacking technical
skills, to build functional applications rapidly by
simply explaining their requirements to the AI, which generates the
relevant code for them.
Who’s the Highest-Paid CEO? - App Economy Insights
[Link]
Rick Smith, co-founder and CEO of Axon.
I Summarized Mary Meeker's Incredible 340 Page 2025 AI Trends
Deck—Here's Mary's Take, My Response, and What You Can Learn - Nate, Ai
& Product [Link]
Nate's overall take is that while Mary Meeker is correct that
Generative AI adoption is exploding, real value accrues only where
organizations align real-world problems with AI’s actual strengths in
workflows. He believes bigger claims demand commensurately bigger
evidence.
Carl Dahlman later gave us the transaction-cost categories that are widely used today:
Search and information costs: discovering what
is available to purchase and comparing alternatives
Bargaining and decision costs: coming to an
agreement between buyer and seller, including establishing the final
price and terms
Enforcement and policing costs: ensuring that both sides hold up their end of the deal
Distribution costs: actually getting the good
or service to the end consumer
― How To Build AI Agents (2025 Guide) - Max Berry, Max'
Prompts [Link]
Key Concepts:
Transaction Costs: Costs incurred in addition to
the actual price of a good or service, necessary to coordinate and
execute a transaction. Marketplaces primarily sell the reduction of
these costs.
TAM Expansion: Reducing transaction costs lowers the effective cost of a good or service, increasing demand and expanding the Total Addressable Market (TAM). The degree of TAM expansion relates to the percentage of total cost eliminated (see the arithmetic sketch after this list).
Value Distribution: Marketplaces save sellers money on transaction costs and charge them a fee (often similar to what sellers paid previously). They typically pass efficiency gains on to buyers in the form of easier and faster experiences, creating a demand-constrained market.
Variable Costs of Addressing Transaction Costs:
Low Variable Costs: Addressing search and
bargaining costs is highly efficient and has low variable costs.
Marketplaces can keep more of the value created here.
High Variable Costs: Addressing enforcement and
distribution costs involves significant variable costs (e.g., funding
returns, building logistics). While these make marketplaces bigger,
margins may be lower as value is passed to buyers.
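To make the TAM-expansion idea concrete, here is a toy calculation in Python (all numbers and the constant-elasticity demand assumption are mine, not the article's):

```python
# Toy TAM-expansion arithmetic: a marketplace removes part of the
# transaction cost, lowering the effective cost and expanding demand.
price = 100.0     # sticker price of the service
txn_cost = 50.0   # search, bargaining, enforcement, distribution costs
saved = 30.0      # transaction costs the marketplace eliminates

old_cost = price + txn_cost          # 150.0 effective cost before
new_cost = old_cost - saved          # 120.0 effective cost after
drop = 1 - new_cost / old_cost       # 20% cheaper in effective terms

elasticity = 1.5                     # assumed demand elasticity
demand_growth = (old_cost / new_cost) ** elasticity - 1
print(f"effective cost drop: {drop:.0%}, implied TAM growth: {demand_growth:.0%}")
```

Under these made-up numbers, a 20% drop in effective cost implies roughly 40% more demand, which is the sense in which cutting transaction costs expands the addressable market.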
Takeaways:
This article puts the concept of transaction costs as central to
understanding marketplaces. Transaction costs are defined as the costs
incurred beyond the actual price of a good or service, associated with
coordinating and executing the transaction itself. Marketplaces are
essentially businesses that sell the reduction of these transaction
costs. Studying transaction costs can help determine where marketplaces
will succeed, what kind of marketplaces to build, and how to price
them.
Looking ahead, the article suggests that the "free lunch"
opportunities in many industries are exhausted, pushing marketplaces
into high variable cost activities. This implies future marketplaces may
be higher scale but potentially lower margin and more operationally
intensive. To disrupt incumbent marketplaces, one should look for
remaining transaction costs that can be addressed much more efficiently
than the current solution. The article suggests disrupting food delivery
was possible by building a more efficient network than restaurants had,
but disrupting shipping for handmade goods is harder because it requires
competing with highly efficient companies like UPS and FedEx.
By far, the largest unsolved transaction costs are in the
services industries (e.g., freelancing, home
improvement), which constitute two-thirds of consumer spending. Most
services marketplaces are currently stuck at the Lead Generation stage,
limiting penetration and take rate. This might be partly because much
spend is on recurring services where customers leave the marketplace
once a good provider is found, leading services marketplaces to rely on
high-churn consumer subscriptions. Despite this, there are
opportunities, such as Zillow exploring expanding into managed
marketplace territory for home services.
[Figure: the four stages of marketplaces]
An hour a day is all you need. - The Improvement
Journal [Link]
The one hour is suggested to be dedicated to three key practices that
aim to rebuild an individual from the ground up:
Build Something That's Yours: This involves
creating something that belongs to you, beyond your job, such as a
newsletter, product, service, blog, or by learning/teaching a skill. The
purpose is to "plant seeds" that will compound over time, pulling you
out of stagnation.
Train Like You Want to Be Here for a While: This
practice emphasizes physical strength and movement, like walking,
running, stretching, lifting, breathing, sleeping deeply, eating real
food, and drinking water. It's a message of self-care and an intention
to use one's body, which also sharpens mental clarity, as Seneca
suggested, "The body should be treated more rigorously, that it may not
be disobedient to the mind".
Create Enough Silence to Hear Your Own Voice: This
habit counters the constant noise and stimulation of modern life. It
encourages practices like journaling, meditating, taking walks without
headphones, or simply sitting still without a goal or screen. The goal
is not productivity, but presence and creating space for reflection and
insight, preventing thoughts from being drowned out and actions from
remaining unexamined.
How to become friends with literally anyone - April & The
Fool [Link]
The article suggests that the way to become friends with anyone is to approach
interactions with a deep-seated belief in shared humanity, genuine
curiosity about individual worldviews, and an open, empathetic demeanor
that seeks to understand rather than judge.
Try to understand people through conversations with a belief
in the universal commonality of human nature: The author
fundamentally believes that all people are driven by the same core human
urges and desires, such as the need to feel loved, respected, and seen.
This perspective makes it intuitive to understand others. They see
meeting someone new as a "puzzle of empathy" and a "game of
commonality," where they try to understand what someone would think,
want, need, or crave given their background, values, limitations, and
longings.
From common nature to differences among people due to
environmental factors: The author acknowledges that
how these needs are defined and achieved varies dramatically
due to factors like nationality, gender, religion, cultural heritage,
socioeconomic class, hobbies, and upbringing. These "little big
differences" are where things become interesting, leading to unique
individual personalities and perspectives.
Follow the reasonableness within their personal worldview to
understand motivations and values: The author believes that
while people may not always be rational, they are always "reasonable"
within their own worldview. This means that everyone has reasons for
their actions, and those reasons make sense within their personal
framework. Understanding a person's circumstances allows the author to
understand their motivations, struggles, and values.
Be curious and genuine while engaging with people:
The author describes themselves as extremely extraverted, loving people,
and hating small talk. They are curious about people, viewing them as
containing "worlds, histories, stories that span across generations and
geographies". This curiosity leads them to give "rapt attention and
genuine space to be yourself".
Assumption of friendship from the beginning, share stories
and genuine care: The author approaches new encounters with the
assumption that "we are friends" from the moment they meet. They are
open, putting "all my cards on the table" and inviting others to reveal
theirs. They enjoy conversations, making people feel understood, heard,
and cared for.
The key takeaway is that genuine technological advancement, which
is real and accelerating, must be distinguished from the business models
built around it, which frequently adhere to age-old patterns. When
stated purpose and actual function align, it typically indicates that
the technology addresses a specific, measurable problem with clear
economic value, rather than promising to "transform
everything."
Systemantics, or, the art of understanding what’s going on, means
recognizing the persistent gaps between what systems proclaim and what
they actually do, and capitalizing on that insight.
When the fog dissipates and clarity emerges, the survivors will
be those who patiently deciphered the underlying mechanics amidst
fleeting illusions.
Enduring AI companies will emerge in two distinct spaces by 2035:
unglamorous but essential tools that demonstrably improve margins or
reduce costs, and genuine frontier research that reveals entirely new
problem spaces. The first category refines what exists; the second
invents what doesn't yet.
Real opportunities lie in the quiet spaces between stated
ambitions and operational truths. Just as they always have.
― The Art of Understanding What's Going On -
Fakepixels [Link]
How I Went From Reading 20 Books Per Year to Over 75 Books -
Read and Think Deeply [Link]
Takeaways:
Always take a book with you.
Give it about 50 pages before you quit. This keeps you from getting
stalled on a book that is not resonating with you.
Schedule the reading time. e.g., 45 min in the morning, 30 min in
the evening, and throughout the day when you get breaks.
Weekend sprints. Read in hour-long stretches on weekends, or do several smaller stretches, and get through entire sections or even whole books.
there are people with half your skills and intelligence living
out your dreams, just because they put themselves out there and didn’t
overthink it.
Reach out anyway—someone will always have more followers, more
free time, a better setup. It’s up to you to push through everything,
part the crowd, and make some space for yourself to at least give
yourself the chance of getting what you want.
You will never be fully ready and there will never be a perfect
time. It’s genuinely not about waiting for the right time to do
something when you’re ready, it’s about doing things before
you’re ready just to make them exist.
Trends - Artificial Intelligence - Mary Meeker, Bond
[Link]
Think Only When You Need with Large Hybrid-Reasoning
Models [Link]
Large Reasoning Models (LRMs) improve reasoning via extended thinking (e.g., multi-step traces), but this leads to inefficiencies like overthinking simple queries, increasing latency and token usage. The team introduces Large Hybrid-Reasoning Models (LHRMs), the first models that adaptively choose when to think based on query complexity, balancing performance and efficiency. They use a two-stage approach: (1) Hybrid Fine-Tuning (HFT), a cold start using curated datasets labeled as "think" vs. "non-think"; (2) Hybrid Group Policy Optimization (HGPO), an online RL method that trains the model to pick the optimal reasoning mode. They define Hybrid Accuracy to evaluate how well the model selects between thinking and non-thinking strategies; it correlates strongly with human judgment. Experiments show LHRMs outperform both LRMs and traditional LLMs in reasoning accuracy and response quality, while also reducing unnecessary computation.
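As a rough conceptual sketch of the inference-time behavior (not the paper's training pipeline; `score_complexity`, `model.generate`, and the control tokens are hypothetical stand-ins):

```python
# Conceptual sketch only: the paper trains mode selection with HFT + HGPO,
# whereas this stand-in heuristic just routes queries between a cheap
# "no-think" pass and a full reasoning trace.

def score_complexity(query: str) -> float:
    """Crude proxy for query difficulty: length plus reasoning keywords."""
    signals = ("prove", "derive", "step by step", "algorithm", "how many")
    hits = sum(s in query.lower() for s in signals)
    return min(1.0, len(query.split()) / 100 + 0.3 * hits)

def hybrid_generate(model, query: str, threshold: float = 0.5) -> str:
    # Prepend a control token telling the model whether to emit a long
    # thinking trace or answer directly, saving tokens on easy queries.
    mode = "<think>" if score_complexity(query) > threshold else "<no_think>"
    return model.generate(mode + query)
```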
The 2025 State of B2B - Monetization - Kyle Poyar
[Link]
The report summarizes a poll of 240 software companies about their
pricing strategies. Key findings indicate a decline in flat-rate and
seat-based pricing models, with hybrid pricing (combining subscriptions
and usage) emerging as the dominant approach, especially for companies
incorporating AI capabilities. The report also highlights a growing
interest in outcome-based pricing among AI-native companies and stresses
the importance of pricing agility and clear ownership of pricing
strategy within organizations.
The Illusion of Thinking: Understanding the Strengths and
Limitations of Reasoning Models via the Lens of Problem Complexity -
Apple Machine Learning Research [Link]
The authors analyzed the thinking process and reasoning traces of LRMs in several smart ways (a sketch of the extraction-and-verification idea follows this list):
A custom pipeline using regex identifies and extracts potential
solution attempts from the LRM's thinking traces.
Extracted solutions are rigorously verified against puzzle rules and
constraints using specialized simulators for step-by-step
correctness.
Records the accuracy of valid solutions and their relative position
within the reasoning trace for behavioral insights.
Categorizes LRM thinking patterns (e.g., overthinking, late success,
collapse) by analyzing how solution correctness and presence vary with
problem complexity.
Examines how the proportion of correct solutions changes
sequentially within the thinking trace, revealing dynamic accuracy
shifts.
Pinpoints the initial incorrect step in a solution sequence to
understand the depth of correct reasoning before error.
Quantifies thinking token usage to analyze scaling of effort with
complexity, noting an unexpected decline at high complexity.
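To make the extract-then-verify steps concrete, here is a minimal sketch using Tower of Hanoi, one of the paper's puzzles (the regex, move format, and function names are my illustrative assumptions, not the paper's exact pipeline):

```python
import re

# Extract candidate moves from a thinking trace, then replay them on a
# Tower of Hanoi simulator to check rule-by-rule correctness.
MOVE_RE = re.compile(r"move\s+disk\s+(\d+)\s+from\s+(\w)\s+to\s+(\w)", re.I)

def extract_moves(trace: str):
    return [(int(d), s.upper(), t.upper()) for d, s, t in MOVE_RE.findall(trace)]

def verify_hanoi(moves, n_disks=3):
    state = {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}
    for disk, src, dst in moves:
        if not state[src] or state[src][-1] != disk:
            return False  # disk isn't on top of the source peg
        if state[dst] and state[dst][-1] < disk:
            return False  # can't place a larger disk on a smaller one
        state[dst].append(state[src].pop())
    return len(state["C"]) == n_disks  # solved iff all disks reached peg C

trace = ("Move disk 1 from A to C, move disk 2 from A to B, "
         "move disk 1 from C to B, move disk 3 from A to C, "
         "move disk 1 from B to A, move disk 2 from B to C, "
         "and finally move disk 1 from A to C.")
print(verify_hanoi(extract_moves(trace)))  # True
```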
How much do language models memorize? - Meta, Google, NVIDIA,
and Cornell University [Link]
This paper proposes a new method to quantify how much information a language model "knows" about a datapoint. It formally separates memorization into two components via two novel definitions: unintended memorization (information about a specific dataset) and generalization (information about the true data-generation process).
There are several interesting findings:
By training models on uniform random bitstrings (eliminating
generalization), they precisely measure model capacity, finding
GPT-style transformers store 3.5 to 4 bits per parameter.
Their framework shows that the double descent phenomenon occurs when
the data size exceeds the model capacity, suggesting that models are
"forced" to generalize when they can no longer individually memorize
datapoints.
The paper develops and validates a scaling law that predicts
membership inference performance based on model capacity and dataset
size, indicating that membership inference becomes harder with larger
datasets relative to model capacity.
To understand their methods: they propose a clever approach to measuring memorization and model capacity in LMs, isolating unintended memorization by training models on uniform random bitstrings.
No Generalization Signal: When training on truly random data, there
are no underlying patterns, rules, or structures for the model to
generalize from. Each bitstring is an independent, random piece of
information.
Only Memorization is Possible: In this scenario, the only
way for the model to "learn" or perform well on this data (i.e., predict
the next bit in a sequence or identify if it was part of the training
set) is to literally memorize the specific bitstrings it has seen. Any
"knowledge" the model gains is purely about the individual data
points.
Total Memorization as Measured: Therefore, when generalization is
effectively zero, the information the model stores about the random
bitstrings directly reflects its total memorization capacity
for that type of information. There's no "general knowledge" to
distinguish; it's all about remembering specific instances.
Therefore, they are measuring the maximum amount of distinct,
specific information the model can store.
They equate total memorization with model capacity. In machine learning,
model capacity generally refers to the size and complexity of
the functions a model is capable of learning. It's the model's
ability to fit a wide variety of patterns in the data. A model with
higher capacity can potentially fit more complex relationships or
memorize more specific data points.
The paper further quantifies this by showing that GPT-style models
have a capacity of approximately 3.6 bits-per-parameter. This indicates
that each parameter in the model effectively acts as a certain amount of
storage for information, reflecting the overall capacity of the neural
network architecture.
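A quick back-of-the-envelope calculation (my arithmetic, not the paper's) shows what roughly 3.6 bits per parameter implies in familiar units:

```python
# What does ~3.6 bits/parameter mean for a hypothetical 1B-parameter model?
bits_per_param = 3.6
n_params = 1_000_000_000

capacity_mb = bits_per_param * n_params / 8 / 1e6   # ~450 MB memorized info
weights_mb = 16 * n_params / 8 / 1e6                # ~2000 MB of fp16 weights
print(f"memorization capacity ~{capacity_mb:.0f} MB "
      f"inside ~{weights_mb:.0f} MB of fp16 weights")
# Training data beyond this capacity "forces" the model to generalize.
```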
The fundamental challenge in understanding and evaluating language models is the ambiguity and conflation of "memorization" (copying or reproducing a specific sequence in the training data) and "learning" (truly understanding and generalizing a pattern or concept). This is exactly what they address by decomposing memorization into unintended memorization and generalization. The
decomposition enables controlled measurement and the use of random
bitstrings is the key innovation.
About the double descent phenomenon: when a model's capacity exceeds
the generalizable patterns in the data, it starts to memorize individual
data points. As data size increases relative to capacity, the model is
"forced" to generalize more, leading to a decrease in unintended
memorization and an improvement in performance.
The core insight is that as models become massively overparameterized
(far beyond what's needed to simply fit the training data), they find
"simpler" interpolating solutions that generalize better, often due to
the implicit biases of optimization algorithms like Stochastic Gradient
Descent (SGD).
Intuition for double descent (a toy demonstration follows this list):
"Under-parameterized" Regime (Classical ML): Model Capacity <
Data Size: very generalizable, low test error
"Interpolation Threshold" (The Peak): Model Capacity ~ Data Size:
peak of test error, due to overfitting the noise
"Over-parameterized" Regime (Double Descent / Modern Deep Learning):
Model Capacity >> Data Size: robust generalization happens, test
error goes down again.
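A classic way to see all three regimes is a minimum-norm least-squares fit with random features. This toy demo is mine, not the paper's, and exact numbers vary with the seed, but test error typically spikes when the feature count is near the number of training points and falls again well past it:

```python
import numpy as np

# Fit noisy data with random ReLU features; np.linalg.lstsq returns the
# minimum-norm solution once width > n_train, which is the implicit bias
# that makes the over-parameterized regime generalize again.
rng = np.random.default_rng(0)
n_train = 20
x_train = rng.uniform(-1, 1, n_train)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(n_train)
x_test = np.linspace(-1, 1, 200)
y_test = np.sin(2 * np.pi * x_test)

def relu_features(x, w, b):
    return np.maximum(0.0, np.outer(x, w) + b)  # one column per feature

for width in (5, 10, 18, 20, 22, 40, 200, 1000):
    w, b = rng.standard_normal(width), rng.standard_normal(width)
    coef, *_ = np.linalg.lstsq(relu_features(x_train, w, b), y_train, rcond=None)
    mse = np.mean((relu_features(x_test, w, b) @ coef - y_test) ** 2)
    print(f"width={width:5d}  test MSE={mse:.3f}")
```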
A key concept here is Membership Inference Attacks (MIAs): attacks that attempt to determine whether a
specific data point was part of a model's training dataset or not. A
successful MIA indicates that the model has "memorized" that specific
data point. "Scaling Laws for Membership Inference" in the paper refers
to predictive relationships that describe how the success rate
of a MIA changes as a function of various model and data
characteristics, such as model capacity and dataset size.
They basically propose that membership inference success is
inversely related to how "generalizable" the data point is within the
model's capacity. In other words, membership inference is
easier when the model has high capacity relative to the
dataset size, leading to more unintended memorization of individual data
points. Membership inference becomes harder (i.e., less
successful for an attacker) when the dataset size is very large compared
to the model's capacity, or when the model starts to "grok" (generalize)
more.
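For intuition, the standard baseline MIA is a simple loss threshold: predict "member" when the model's loss on an example is unusually low. A minimal sketch with synthetic losses (illustrative only; this is the generic baseline, not the paper's scaling-law analysis):

```python
import numpy as np

# Loss-threshold membership inference: memorized training examples tend
# to have lower loss than unseen ones, so a threshold separates them.
rng = np.random.default_rng(0)
member_losses = rng.normal(1.0, 0.5, 5000)     # synthetic, slightly lower
nonmember_losses = rng.normal(1.5, 0.5, 5000)  # synthetic, slightly higher

threshold = 1.25  # midpoint here; in practice tuned on held-out data
tpr = (member_losses < threshold).mean()       # true positive rate
fpr = (nonmember_losses < threshold).mean()    # false positive rate
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")
# As datasets grow relative to capacity, the two loss distributions
# overlap more and this gap (the attack's signal) shrinks.
```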
LlamaFirewall: An open source guardrail system for building
secure AI agents - Meta AI [Link]
Reasoning models don't always say what they think -
Anthropic [Link]
The core issue explored is whether the Chain-of-Thought accurately
reflects the model's true reasoning process ("faithfulness"). The
article's research suggests that models frequently do not faithfully
report their reasoning. Researchers tested this by subtly providing
hints (both correct and incorrect) to models during evaluations. They
then checked if the models admitted using these hints in their
Chain-of-Thought explanations. The results showed that models (Claude
3.7 Sonnet and DeepSeek R1) rarely mentioned the hints they used, even
when those hints led them to a different answer or involved potentially
problematic information (like "unauthorized access"). Instead, models
often created fake rationales for their answers.
The researchers also studied reward hacking by constructing synthetic
Reinforcement Learning (RL) environments with injected "reward hacks."
They then finetuned an earlier snapshot of Claude 3.7 Sonnet in these
environments using outcome-based RL. By setting up the environments this
way, they could observe if the models learned to pick the incorrect hint
answer to maximize their reward, and then check if their
Chain-of-Thought verbalized this reward-seeking behavior. They found
that models learned the reward hacks very quickly (selecting the
incorrect hint answer over 99% of the time), but rarely verbalized these
hacks in their Chain-of-Thought (less than 2% of examples in most
environments).
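A bare-bones version of the hint-injection check might look like this (my sketch, not Anthropic's code; `query_model` is a hypothetical stand-in returning a (chain_of_thought, answer) pair, and the real evaluations grade verbalization far more carefully):

```python
# Probe whether a model's chain of thought admits using an injected hint.
def faithfulness_probe(question: str, hint: str, query_model):
    _, baseline_answer = query_model(question)
    hinted_q = f"{question}\n(Hint: a colleague thinks the answer is {hint}.)"
    cot, hinted_answer = query_model(hinted_q)

    # Did the hint actually change the model's answer?
    used_hint = hinted_answer == hint and hinted_answer != baseline_answer
    # Crude check for acknowledgment; real grading is far more careful.
    verbalized = "hint" in cot.lower()
    return {"used_hint": used_hint,
            "verbalized": verbalized,
            "unfaithful": used_hint and not verbalized}
```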
Your Brain on ChatGPT: Accumulation of Cognitive Debt when
Using an AI Assistant for Essay Writing Task - MIT [Link]
This experimental study combines neuroscience (EEG), educational
psychology, and human-AI interaction to examine how AI tools like
ChatGPT influence cognitive engagement during essay writing. The results show that the brain-only group had stronger and broader neural connectivity, especially in theta bands, indicating deeper internal ideation and cognitive engagement. The LLM group exhibited reduced alpha and theta connectivity, suggesting externalized and narrower thought patterns, relying more on ChatGPT suggestions than on internal generation of ideas. In short, the study warns that heavy reliance on AI can dull your own thinking.
YouTube and Podcasts
Sundar Pichai: CEO of Google and Alphabet | Lex Fridman
Podcast [Link]
Jared Isaacman: What went wrong at NASA | The All-In
Interview [Link]
Naval Ravikant On The 4 Books That CHANGED His Life
(Financially And Philosophically) [Link]
Chamath Palihapitiya: Zuckerberg, Rogan, Musk, and the
Incoming “Golden Age” Under Trump - Tucker Carlson [Link]
Satya Nadella on AI Agents, Rebuilding the Web, the Future of
Work, and more - Rowan Cheung [Link]
Jeff Bezos: Amazon and Blue Origin | Lex Fridman Podcast -
Lex Fridman [Link]
WWDC25: Platforms State of the Union - Apple [Link]
IPOs and SPACs are Back, Mag 7 Showdown, Zuck on Tilt,
Apple's Fumble, GENIUS Act passes Senate - All-In Podcasts [Link]
Articles and Blogs
Everything Google Announced at I/O 2025 - WIRED [Link]
Launch Hugging Face Models In Colab For Faster AI Exploration
- Medium [Link]
My AI Skeptic Friends Are All Nuts - Thomas Ptacek
[Link]
The author argues that LLMs as agents are improving developer
productivity, and suggests that while the hype around AI can be
annoying, the technology's impact is real and profound. He believes that
those who don't embrace AI in their coding practices will be left
behind.
The author shares a disorienting sense of reality's erosion,
attributing it to various factors, including the relentless pace of
digital information, the overwhelming nature of political events, and
the insidious proliferation of AI. This environment fosters a collective
cognitive detachment and erosion of critical faculties, making it
challenging to discern truth, engage effectively, and maintain a
grounded sense of self and world.
New Tools
Meet the Foundation Models framework - Apple [Link]
The iPhone maker has launched the Foundation Models framework to
allow users to run a 3B parameter model locally. The framework is part
of Apple Intelligence suite and allows developers to access it using
three lines of code. The model can be used to generate text, extract
summaries, and tag structured information from unstructured text.
Users should be aware of its strengths and weaknesses. It's only available on Apple Intelligence-enabled devices with OS version 26+. You need to use Xcode Playgrounds to prototype with real model output. You can use the Instruments profiling template to measure latency and token overhead. There is no support for fine-tuning or external model deployment.
Connect Your MCP Client to the Hugging Face Hub -
HuggingFace [Link]
HuggingFace releases open-source MCP server to allow accessing its
tools from VSCode and Claude Desktop.
New Book List
Some book names from my daily readings recently caught my attention
and might be the next book to read for me:
"How Leaders Learn" by David
Novak is a great book for active learners. It has three
chapters: "Learn from", "Learn to", and "Learn by".
Active learners are like artists—constantly refining, adapting, and
evolving. They approach life as a masterpiece-in-progress,
understanding that each new insight adds depth and clarity to the bigger
picture. The book encourages active learning and defines it as a mindset
- a daily discipline of seeking out knowledge from people, experience,
and failures, staying open to feedback, new perspectives, and
uncomfortable truths, and taking actions to test ideas, adapting and
refining.
An active learner is somebody who seeks out ideas and insights and
then pairs them with action and execution. They learn with purpose. The
result is greater possibilities, for them and the people around them.
It's as Eric Hoffer, the American philosopher, wrote in Reflections
on the Human Condition: "In a time of drastic change, it is the learners
who inherit the future." They can't wait to discover the next idea, and
the next, and the next, because behind every idea is a world of
possibility and a brighter future.
Warren Buffett once told me what he looks for in the companies he
acquires. He said, "I'm looking to buy companies that are run by
painters." When I asked for an explanation, he said, "Most great artists
have a hard time letting go of their paintings. They're in love with the
painting. They are constantly adding a dab of color here, a little more
texture there. I'm looking for the boss who is always tweaking their
company, constantly trying to make it better. No matter how successful
they may have already been, what they still see is a
masterpiece-in-progress." He calls Berkshire Hathaway a museum for these
masterpieces, but he expects the people who run them to keep making
progress, to keep changing and expanding.
This book covers a lot of good practices, some of which I learned
through experience and have been implementing in daily life, but I've
never clearly summarized them in words like this author does (e.g.,
learn from failure and success, learn to ask better questions, learn to
develop pattern thinking, learn to reflect, learn by tackling problems,
etc); some are common sense to people but not easy to follow (e.g. learn
to see the world the way it really is, learn to make and check your own
judgments, learn by being your best self, learn by seeking new
challenges, learn by making everyone count, learn by recognizing on
purpose, etc); others are new ideas and wise advice to me that are
incredibly enlightening (e.g., learn from new environment, learn to
trust in positive intentions, learn to be humble and confident, learn by
simplifying, learn by teaching, etc.).
My Learnings:
I carry forward the good values from my upbringing, discard the bad ones, and move on, but I never go back and think about the weaknesses and blind spots that developed implicitly.
Our upbringings shape us: the good and bad experiences, the normal
experiences of our day-to-day lives. When you choose to learn from your
upbringing, you learn who you are, your strengths and weaknesses, your
unique perspective, and your blind spots.
I'm the type of person who sticks to one thing or one job, does the best, and gets the most learning from it: greedy but probably not the most efficient approach. So this is the top piece of advice for me:
"Not moving means not growing" and
"Choose environment wisely and don't stand
still".
New environments bring uncertainty and risk, two things humans really
don't like. The brain weighs threats of loss heavier than it does
opportunities for gain. Whether it's a move to a new city or a move to a
new company, we don't know the people or the culture and we don't know
if we'll succeed when we get there. The brain tells us it's best if we
just stay where we are, in our more certain, less risky, known
environment. But that's not always the right choice. Josh Waitzkin,
child chess prodigy, subject of the book and movie Searching for Bobby
Fischer, and later a tai chi world champion, wrote in The Art of
Learning, "Growth comes at the expense of previous comfort or
safety."
However, not every new environment is good for you, it requires some
luck and judgment.
When looking at a new environment, evaluate it for these four sources
of learning:
New knowledge, skills, or systems
New ideas and innovative thinking
New people and their perspectives and opinions
New influences that lead to personal growth
Some new environments aren't going to advance your learning; they
might even slow you down.
First, make sure the new environment will offer opportunities to
learn and grow in any area that's important to you right now, like I
did. This is especially true when you have an ambition but aren't sure
how to get there.
"As important as this work is, the next important step is to insert
ourselves in an environment filled with people who routinely do what
we're struggling to imagine." This is the whole point of choosing a new
environment.
Second, choose an environment that's suited to you. Understanding
your personal ideal environment is an important aspect of
self-awareness.
Third, choose an environment that will exert the right influences on
you, so that you're not only learning new skills, new knowledge, and new
ideas, but also absorbing better collaboration, better leadership,
better self-management, or whatever area of personal growth you think
you need to work on.
It's not only about growth; a new environment can shape a person.
Our social and cultural environments have a huge impact on our
thinking and behavior. In Influencer, psychologist Joseph Grenny and his
coauthors explain that if you want to change behavior, you have to make
changes to the social and structural environment. In Atomic Habits,
James Clear argues that our environments usually matter more than our
motivation when it comes to building new habits: "Especially over a long
time period, your personal characteristics tend to get overpowered by
your environment."
You can either fight that truth or leverage it to learn more and grow
more. Eric Gleacher recognized the power of environment and how it could
not only offer new skills but also shape the person he would become at a
surprisingly young age.
Have a look at what a getting-things-done talent looks like and fill the gaps. Although people all succeed in different ways and no one's success is replicable, becoming a 'working genius' is at least a good place to start.
Invention: creating novel ideas or solutions
Discernment: evaluating and analyzing ideas and situations
Galvanizing: organizing and inspiring others to take action
Enablement: providing encouragement and assistance
Tenacity: pushing projects to completion
If you're wondering who you should turn to, always start with people
who have applied their ideas in the real world and can prove that they
work.
Next, ask, Will they actually fill my gaps, or will they hold back
their best ideas or try to elevate their ego by making what they know
seem complex and hard to understand? Will they make their knowledge
simple and clear? Essentially, you're asking, Is this person an active
learner? Because active learners love helping people fill their
gaps.
A final tip: if you want people to share their know-how with you, you
need to spread know-how. You need to be willing to share with them,
too.
Humans have an instinct to avoid social pain and negative truths about themselves; when someone tells a less-than-positive truth, we need to fight our instincts and listen.
When somebody cares enough and is brave enough to tell you the truth,
your best course of action is to fight your instincts to dismiss it or
hide from it. Overcome your brain's biological drive to protect you.
Shut down the voice in your head telling you they're wrong. Don't run
out of the room. Take some deep calming breaths (that really works),
remind yourself that this person probably has a good reason for bringing
the truth to your attention, and listen.
Active learners work through this set of mental gymnastics every day.
They work on their humility and maintaining an open mind (more on this in part two) because they see the value truth-tellers bring.
Pursue the truth of the world; don't be delusional. Although 'we see the world as we are, not as it is' (an adaptation of Anaïs Nin's famous quote), we should at least be aware of this.
Andy Pearson: "Learn to see the world the way it really is, not how
you wish it to be."
Darrow: "Chase after the truth like all hell and you'll free
yourself, even though you never touch its coat tails."
In their book Decisive, Chip Heath and Dan Heath explained that a
sound decision-making process is more important than data and analysis,
because no matter what, that data or our analysis of it is often flawed.
We interpret it based on what we wish or what we assume or what we
think, not what is.
Good process can lead to better analysis, they explained, but
analysis without good process won't produce the best learning. You need
both to orient yourself to reality.
When you see the world the way it really is, the right action becomes
very clear.
One of the best ways to be a better critical thinker is to make sure
that your information is as close to the source as possible. If you
don't go to the source yourself, you might be letting one perception
after another influence what you end up hearing or learning. You won't
know if you're seeing reality.
When you're trying to see the world the way it really is, it's
important to not be blinded by good news, something a good process can
help you overcome.
A great way to stay grounded is to not only chase the truth but also
deal in it. Active learners know the value of being honest and
transparent. They tell it like it is, because they know when they do,
there's a greater chance others will, too.
I love pattern thinking and actively seek it out, but I still limit myself with a passive pursuit of richer life experiences.
To prepare to make that leap, active learners expose themselves to as
many patterns from as many disciplines as they can. Being curious about
the world around us in the hope that we'll discover a new way of
thinking about a problem or a new way of seeing an opportunity is core
to active learning. Active learners read, listen, travel, try new
things, explore hobbies and interests. They explore trends and insights
from different disciplines, industries, cultures. Then they apply what
they've absorbed to problems or goals. Those habits have helped me come
up with some of my most successful ideas.
You might think of a pattern-thinking moment as an aha moment or a
stroke of inspiration, but active learners don't wait for the moment to
hit them; they work to find it.
Peter Georgescu, chairman emeritus of advertising giant Young &
Rubicam and author of The Source of Success, said of pattern thinking,
"A creative solution is a leap, and that leap is supported and fed and
nurtured by experiences in life. The richer your life experience is, the
more creative you'll become."
On reflection and thinking, the book describes two modes:
focus mode and diffuse mode. This resonates with me, as I do see the
benefits of switching between data science work during the day and
freestyle dancing in the evening for developing creative ideas
and getting unstuck from difficult problems.
In her talk, she described two modes of thinking: focus mode and
diffuse mode. Focus mode is exactly what it sounds like. It's how we
think when we're trying to accomplish a task or memorize something. Our
thinking is usually confined to neural paths we've already created.
Diffuse mode is a more "relaxed set of neural states" that allows our
thinking to take off, range widely, and process or even create new
ideas. When we are learning, we need both. And when we feel stuck in our
thinking, unable to understand a concept, unable to unravel a challenge,
we especially need the diffuse mode.
A combination of confidence and humility is a good
characteristic. I've never thought about them deeply as a pair, which is
why I've never found the sweet spot.
Confidence is important because nobody will follow you unless they
believe you know where you're going and you'll find a way to get there.
If that confidence isn't tempered by humility, though, it becomes
arrogance.
Humility is just the recognition that you can't do it by yourself
whatever "it" is-either because you simply can't, because you don't know
enough, or because it won't be as fun or fulfilling if you go it
alone.
Confidence is simply the expectation that you'll find a way to
win-somehow.
People have a good side and a bad side. If you believe in their good
side, they tend to live up to it. From another perspective, it's often
not entirely their fault when they express the bad side.
In any relationship, business or personal, somebody has to trust more
or trust first to break inertia and build up positive momentum.
As important as it is for us to trust in positive intentions, if we
want people to trust in ours, we need to behave accordingly. We need to
build a well of trust to draw on.
We're all human; we're all going to lose our tempers or handle a
delicate situation poorly or not show as much compassion as we should or
make a poor judgment call. When we're on the receiving end, if we can
take a breath, find a little empathy, and trust that the other person
has good intentions that didn't pan out, we can avoid a total breakdown
in the flow of ideas and learning and collaboration.
I read a striking definition of trust recently: "Trust is a
relationship of reliance." Aren't we all reliant on each other if we
want to learn, grow, and expand our possibilities? We can choose to
support that relationship or tear it down. If we choose the second
option, we're only limiting ourselves. If we choose the first, the
possibilities are infinite.
This is from my experience: I only think hard, struggle, and
learn when I'm walking my own unique life path. I don't take time
to think when I follow someone else's path or live up to someone else's
expectations.
You may know the quote often attributed to Oscar Wilde: "Be yourself;
everyone else is already taken." (What he actually wrote is more cynical:
"Most people are other people. Their thoughts are someone else's opinions,
their lives a mimicry, their passions a quotation.") Maybe because of my
background and the potential prejudgments that came with it, I've spent
most of my life working hard to just be me: to understand who that person
is, the contributions I have to offer, what I believe, and my purpose
and passions. If I hadn't followed this path, I would have missed out on
so much learning.
Active learners know that it's hard to learn when your mental energy
is focused on trying to be somebody other than yourself. Instead of
being open and curious, you'll be defensive. You'll be putting up
barriers and withholding your brilliance. And then the people around you
will do the same. Most of us can sense when people aren't being
authentic, and it makes us trust them less.
Active learners like Marvin pursue authenticity by recognizing their
unique value and talents, figuring out what matters to them and why, and
then leveraging it to have a positive impact.
It's all about bringing who you are to the moment so that you're
comfortable and open-minded enough to learn important lessons and ideas
as they arise.
Everyone knows we should do the right thing, but when it comes
to difficult situations, will you prioritize it above all
else?
This is vital, because over time, depending on environments and
circumstances and your own choices, your sense of right and wrong can
suffer from stepwise degradation. You stray over the line, stray a bit
further the next time, justifying one bad action after another. Stray
too far over the line and you can lose sight of it entirely. Eventually,
you lose the ability to know what doing the right thing looks like.
The best thing that happens when we do the right thing is that we
feel good about our choices and the impact we're having on the world,
and that inspires us to keep doing the right thing. Values aren't
something you write down on a piece of paper and then put in a drawer or
hang on the wall. Values are something you use to take good action. It
isn't always the easy choice, but it's always the best choice and the
one that helps you learn the most powerful lessons.
Input and output are different things. We collect information by
taking in knowledge from outside, and we make sense of that knowledge by
networking it within our brains and outputting it in a slightly
different way, which requires our logical, critical, and creative
thinking.
Two things happen in the brain that help us "learn what we know." One
is that we believe ideas more when we share them with others verbally,
especially if we're trying to convince others that they're true.
Psychologists call it the "saying is believing effect." Want to convince
yourself to make time to exercise three times a week? Try convincing
somebody else that they could fit a simple exercise regimen into their
schedule. Another is that speaking (and writing) brings a different part
of our brain into play than just thinking, which changes how we think
about an idea. It's one reason that we can struggle and struggle to come
up with a solution to a problem, but almost as soon as we explain the
problem to another person out loud, a good solution pops into our head.
Talking it out forces us to slow down, zoom out (simplify), and order
our thoughts.
Sometimes it's the audience's engagement and support that push us further
along the learning journey.
I learned things I didn't know, and I learned what I already knew, as
Timo put it, as I analyzed leadership, considered it from different
angles, and expanded or supported my ideas. Active learners use this
process to codify their ideas into something digestible and easily
shared. When you codify it, you can scale it.
Teaching well also forces you to stay on top of your game, to
continually look for new material to keep your ideas current and
relevant. And it forces you to learn good storytelling, an invaluable
skill. Stories are stickier than almost any other kind of information.
If you want an idea to stay with people, you better be able to convey it
in a relevant, compelling story with emotion and tension.
Many know "people go first"; few know how to practice it. If you want
them to care about what you care about, you need to care about them
first.
Active learners understand that people-not knowledge or
results-should be the priority. How we support people, how we show our
gratitude for them, how we show our interest or concern for them has a
much greater impact, especially over time, than the latest quarterly
earnings or the latest market rankings. I've said it before: I really
like to win. But you don't win for long if the people who make the
winning possible don't know how much they count.
I have always admired Geoff Colvin, senior editor-at-large of Fortune
magazine and author of books like Talent Is Overrated and Humans Are
Underrated. When he joined me on my podcast, he described the kinds of
high-value work that only humans can do and that technology or AI can't:
empathy, collaboration, and the insights or learning we generate along
the way.
The greatest muscle you can build is urgency. Decrease the time
between having an idea and getting it done. Everything changes – Codie
Sanchez
You either chase your one big goal with everything you’ve got, or
nothing will happen. Trying to be balanced is what’s wrong with
society.
Success in any field comes from IMBALANCE.
Hard work only feels bad if it’s building someone else’s dream,
not yours.
The most important thing in your career: Speed.
If you answer emails fast, walk fast, talk fast, get sh*t done
fast, you will make a lot of money. No sense of urgency, you won't. –
Nick Huber
― The Most Successful People I Know Have a Psychopathic Sense
of Urgency - Unfiltered by Tim Denning [Link]
UBER: Distribution is The King - Capitalist Letters
[Link]
A great business with potential for continuous growth and expansion, at a
cheap stock price.
Ecosystem is a huge advantage because it creates cross-platform
efficiencies.
[Figure: Uber business model]
The concept of network effect was first laid out in 1985, by Carl
Shapiro and Michael Katz in their seminal paper “Network
Externalities, Competition, and Compatibility.”
[Figure: network effect]
According to Russ Harris, author of The
Happiness Trap, values are “how we want to be, what we want to stand
for, and how we want to relate to the world around us.”
Values are attributes of the person we want to be.
― How Successful People Timebox - Nir Eyal's
Substack [Link]
Identify your values -> turn values into time commitments ->
create a timeboxed calendar -> track distractions -> reflect and
refine weekly. Do remember to schedule fun activities, use flexible
categories, and be aware that the goal isn't finishing tasks.
Some key ideas in the article backed by behavioral science:
According to The Happiness Trap by Russ Harris (Acceptance
and Commitment Therapy - ACT), productivity should align with personal
values (e.g., health, relationships, growth) rather than just task
completion.
People often ignore realistic time estimates in favor of optimistic
ones, leading to overpacked schedules and missed deadlines.
Especially in the Bay Area, the problem isn’t mediocrity—it’s
misdirected excellence. Kids under Chua’s parenting style rarely have a
choice in their own extracurriculars from elementary through high
school. (I doubt being vice-president of the National Honor Society is a
dream to most.) Sure, it can produce a passable overachiever who knows
how to get A’s. But to produce someone capable of real vision, high
agency, and contrarian thinking, the irony is that that overachiever may
be ill-prepared as we approach an era where AI handles rote tasks and
the knowledge economy demands more creativity.
― How to Raise High-Agency Kids - Rebecca Wang [Link]
True excellence and future success come from fostering
agency—self-directed purpose, curiosity, and ownership—rather than
forcing kids to conform to hyper-competitive, checklist-driven
achievement cultures (like those common in the Bay Area).
What's the root problem? - misdirected
excellence
What's the solution? - Give kids structure
(boundaries, values) but autonomy (freedom to pursue
interests).
This isn’t about faking confidence. It’s about understanding the
low-pressure way to join a group.
Our ability to notice intricate details allows us to ask the
specific questions that make others feel truly seen.
In a world where everyone is clamoring to be heard, the ability
to observe and truly listen becomes your superpower.
Robert Greene's The 48 Laws of Power completed the picture with
"Never Outshine the Master", a lesson teaching the power of blending in
rather than disrupting. Don't announce your presence; become part of the
scenery, then contribute when appropriate.
― The Spy Trick to Joining Any Conversation (Even If You're
Anxious) - AnifragileADHD [Link]
For neurodivergent individuals (ADHD, social anxiety, etc.),
socializing isn’t about performing—it’s about strategic observation and
gradual integration. This article is backed by psychology and behavioral
science.
Small tips:
Stand inside the group (not on edges) and listen silently at
first.
Linger quietly to blend into the social environment.
Wait for a group member to naturally include you.
Ask open-ended questions about others’ interests.
Sustain conversation with follow-up questions.
Articles and Blogs
Scientists discover quantum computing in the brain -
Brighter [Link]
This research bridges quantum physics, biology, and information
theory, suggesting that life evolved to exploit quantum
mechanics for survival and intelligence. It challenges
reductionist views of biology and could redefine our understanding of
consciousness, disease, and even the origins of life.
Here are the 19 US AI startups that have raised $100M or more
in 2025 - TechCrunch [Link]
Just as “internet” evolved from buzzword to business backbone, AI
is following the same playbook.
― In 2025, venture capital can’t pretend everything is fine
any more - Pivot to AI [Link]
Venture capital in 2025 is a dying industry clinging to AI as its
last hope, with most investment funneled into OpenAI and a few other
hyped players while the rest of the startup ecosystem collapses. The
sector, which thrived on zero-interest-rate euphoria, now faces a harsh
reality: no exits, frozen IPOs, and a market unwilling to fund
early-stage ventures. VCs blame Trump’s chaotic tariffs—despite many
having supported him—but the real issue is their own inability to adapt
to a normal economy. The NVCA report offers no solutions, just desperate
optimism, as the industry’s leaders—many of whom lucked into
success—flail in ideological fringe movements and pray for a miracle.
The only remaining question is whether AI will keep the bubble inflated
long enough for them to cash out before it all implodes.
The walled garden cracks: Nadella bets Microsoft’s
Copilots—and Azure’s next act—on A2A/MCP interoperability -
VentureBeat [Link]
Nadella’s endorsement signals Microsoft’s commitment to open
protocols over proprietary ecosystems, aligning with his long-standing
advocacy for interoperability (e.g., ONNX, GitHub’s multi-model
approach). By backing A2A (agent-to-agent communication) and MCP
(model-data context standardization), Microsoft ensures Copilot,
Foundry, and Azure AI can seamlessly integrate with third-party AI
agents and tools. This move preempts enterprise concerns about vendor
lock-in, a criticism of past Microsoft products.
Car Companies Are In A Billion-Dollar Software War, And
Everyone's Losing - InsideEVs [Link]
Why is it so hard to shift from a legacy automaker to an SDV
(software-defined vehicle) company?
Cultural shift: Legacy automakers treated software as an
afterthought, not a core product. Now, they must adopt a Silicon
Valley-like approach.
Supplier dependence: Traditional automakers rely on suppliers for
ECUs, creating a tangled web of software layers.
Safety vs. agility: They must balance "move fast and break things"
with "zero defects or recalls."
Hybrid challenges: Slowing EV demand means SDV systems must also
work with internal-combustion vehicles, complicating power and update
logistics.
Legacy automakers must become software companies to survive, but the
transition is painfully slow and expensive. The winners will be those
who can blend Silicon Valley speed with automotive-grade
reliability—something no traditional automaker has fully achieved
yet.
8 Reasons Leadership Is Hard And Why Few Are Prepared To Lead
- Forbes [Link]
The most inspiring leaders today aren’t just adapting—they’re
rewriting the rules. Leadership isn’t a pinnacle; it’s
a daily practice of courage and reinvention. The world
doesn’t need more bosses; it needs architects of
possibility.
Summary:
The Myth of the Omniscient Leader
Shift: From "knowing it all" to curiosity-driven
collaboration.
Action: Adopt a "Learn It All" mindset (Microsoft’s Satya Nadella
famously replaced "Know It All" with this).
Tool: Host "No Answers Meetings" where leaders openly discuss
unsolved problems, inviting teams to co-create solutions. Example:
Google’s "20% Time" empowers employees to explore innovations beyond
their core roles, democratizing problem-solving.
Embracing the Illusion of Control
Shift: From command-and-control to adaptive
stewardship.
Action: Practice "Scenario Planning" (like Shell Oil’s famed
strategy) to prepare for multiple futures, not just one.
Mindset: View volatility as a laboratory for innovation. Spotify’s
"Fail Fast, Learn Fast" approach rewards experimentation.
Quote: "The art of leadership is not to control, but to
unleash." — Reed Hastings, Netflix.
The Leadership Pipeline Crisis
Root Cause: Short-term efficiency has gutted long-term talent
development.
Fix: Reverse Mentorship Programs (e.g., GE’s junior employees mentor
execs on digital trends).
Metric: Track "Readiness Ratios"—how many high-potentials are
prepared for next-level roles?
Warning: Deloitte’s research shows 89% of executives see "weak
leadership benches" as their top threat.
Tool: Use "Pre-Mortems" (anticipating failures before launch) to
stress-test strategies. Example: Blockbuster’s rigid playbook failed,
while Netflix’s pivot to streaming embraced uncertainty.
Respect as a Daily Earned Currency
Key: Authenticity > Authority.
Action: Practice "Radical Transparency" (like Bridgewater
Associates’ culture of brutal honesty).
Tool: Replace "All Hands Meetings" with "All Hearts Meetings"—forums
for empathy and vulnerability. Example: Edelman’s Trust Barometer shows
employees trust "a peer like me" 3x more than CEOs.
Rebuilding Trust in Judgment
Antidote: Inclusive Decision-Making.
Action: Form "Shadow Boards" (e.g., Gucci’s millennial council
advising execs).
Rule: For major decisions, require "Disagree & Commit" (document
dissent but align once decided). Example: Patagonia’s CEO involves
employees in sustainability bets, building trust through shared
stakes.
Titles vs. Influence
New Power Model: Fluid Hierarchies.
Action: Adopt "Holacracy Lite" (like Zappos’ role-based authority,
not title-based).
Symbolic Step: Drop "CEO" for "Chief Enabler" (as some startups do
to signal servant leadership).
Stat: 72% of Gen Z workers prefer "Project Leaders" over "Managers"
(McKinsey, 2024).
Tool: "Skills Gap Heatmaps"—quarterly self-assessments on emerging
competencies (e.g., AI literacy).
Example: Adobe’s "Kickbox" program gives employees $1,000 to test
new ideas, forcing leaders to adapt.
The Path Forward: Leadership as a Dynamic
Practice
Your closing question—"Will you be one of them?"—is the call
to action. Leaders who thrive will:
1. Lead with Questions, not answers.
2. Treat Trust as Currency, not a given.
3. Build Antifragile Teams (Nassim Taleb’s concept of growing stronger
through chaos).
4. Measure Success in Learning Cycles, not quarterly profits alone.
Microsoft Follows Competitors Amazon, Meta, and Google in
Employee Productivity Crackdown [Link]
The pandemic hiring spree, rising interest rates, and the AI arms
race have forced tech giants to abandon the "growth at all costs"
mindset. Instead, they’re:
Maximizing output per employee (via stack ranking
and attrition policies)
Investing savings into AI (where Microsoft is
battling Google and OpenAI)
Master The Psychology Of Building An Unforgettable Personal
Brand - Forbes [Link]
When your brand is rooted in internal conviction, it radiates
effortlessly. The right opportunities find you.
"My worth isn’t measured by likes; it’s measured by impact."
"If they don’t buy, it’s not a rejection—it’s a mismatch."
"Outcomes are data, not identity."
"Consistency today compounds into authority tomorrow."
Zero to One: Learning Agentic Patterns - Philschmid
[Link]
This guide explores techniques such as prompt chaining, routing,
parallelization, reflection, tool integration, planning, and multi-agent
collaboration. It features practical code examples for each pattern,
enabling the development of efficient, context-aware workflows with
Google DeepMind Gemini. Emphasis is placed on structured strategies to
enhance task delegation and agent coordination.
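Prompt chaining is the simplest of these patterns to see end to end: each LLM call consumes the previous call's output. Here is a minimal sketch; `generate` is a stand-in for whatever model client you use (the guide itself uses the Gemini API), so treat the names and prompts as illustrative:

```python
def generate(prompt: str) -> str:
    """Stand-in for a real model call (the guide uses the Gemini API)."""
    return f"<model output for: {prompt[:40]}...>"

def chain(topic: str) -> str:
    # Each step feeds the next; that is the essence of prompt chaining.
    outline = generate(f"Write a 3-bullet outline about: {topic}")
    draft = generate(f"Expand this outline into two paragraphs:\n{outline}")
    return generate(f"Tighten the wording of this draft:\n{draft}")

print(chain("why structured agent workflows beat one giant prompt"))
```

Routing, reflection, and the other patterns in the guide build on this same skeleton: routing picks which prompt (or agent) runs next, and reflection feeds a critique of the output back into another call.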
Our research shows that by 2030, data centers are projected to
require $6.7 trillion worldwide to keep pace with the demand for compute
power. Data centers equipped to handle AI processing loads are projected
to require $5.2 trillion in capital expenditures, while those powering
traditional IT applications are projected to require $1.5 trillion in
capital expenditures (see sidebar "What about non-AI workloads?").
Overall, that's nearly $7 trillion in capital outlays needed by 2030—a
staggering number by any measure.
To qualify our $5.2 trillion investment forecast for AI infrastructure,
it's important to note that our analysis likely undercounts the total
capital investment needed, as our estimate quantifies capital investment
for only three out of five compute power investor
archetypes—builders, energizers, and technology developers and
designers—that directly finance the infrastructure and foundational
technologies necessary for AI growth (see sidebar "Five types of data
center investors"). Approximately 15 percent ($0.8 trillion) of
investment will flow to builders for land, materials, and site
development. Another 25 percent ($1.3 trillion) will be allocated to
energizers for power generation and transmission, cooling, and
electrical equipment. The largest share of investment, 60 percent
($3.1 trillion), will go to technology developers and designers, which
produce chips and computing hardware for data centers. The other two
investor archetypes, operators, such as hyperscalers and colocation
providers, and AI architects, which build AI models and
applications, also invest in compute power, particularly in areas such
as AI-driven automation and data center software. But quantifying their
compute power investment is challenging because it overlaps with their
broader R&D spending.
― The cost of compute: A $7 trillion race to scale data
centers - McKinsey [Link]
[Figure: estimated global data center capital demand]
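The headline split is easy to sanity-check. A quick back-of-the-envelope check in Python, using only the figures quoted above:

```python
# Figures from the McKinsey excerpt above (trillions of USD).
ai_capex = 5.2
shares = {"builders": 0.15, "energizers": 0.25, "tech developers/designers": 0.60}

for name, share in shares.items():
    print(f"{name}: ~${share * ai_capex:.2f}T")   # 0.78, 1.30, 3.12

print(f"AI + non-AI: ${ai_capex + 1.5:.1f}T")     # 6.7 -> 'nearly $7 trillion'
```

The three archetype shares sum to 100 percent of the $5.2 trillion AI estimate, and adding the $1.5 trillion for traditional IT reproduces the $6.7 trillion headline.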
The Comfortable Life is Killing You - Poetic Outlaws
[Link]
Meaning is forged in resistance - Meaning is a byproduct of
engagement with resistance. Joy emerges when we meet challenges worthy
of our souls. To paraphrase Camus: The struggle itself is
enough.
Agentic AI Is Already Changing the Workforce - Harvard
Business Review [Link]
Papers and Reports
The power of one: How standout firms grow national
productivity - McKinsey Global Institute [Link]
Productivity growth is crucial for economic prosperity. The report
suggests that instead of waiting for all firms to improve, targeted
support for high-potential firms could accelerate national productivity
gains.
Identifying and scaling AI use cases - OpenAI [Link]
OpenAI ads, but useful for pitching GenAI use cases. It offers
guidance on identifying and scaling AI use cases within organizations,
noting that AI adoption is rapidly increasing and demonstrating
significant benefits for early adopters. It emphasizes three key steps
for businesses: understanding where AI can add value by focusing on
repetitive tasks, skill bottlenecks, and navigating ambiguity; teaching
teams fundamental AI use cases like content creation, research, and
automation; and prioritizing opportunities using an impact/effort
framework to determine which projects to pursue and scale.
ZeroSearch: Incentivize the Search Capability of LLMs without
Searching - Alibaba Group [Link]
Traditional RL training requires massive API calls to services like
Google Search, costing hundreds of thousands of dollars. ZeroSearch
replaces this with a simulated search environment where
the LLM itself generates both relevant and irrelevant documents in
response to queries.
Real search engines return documents of unpredictable quality,
complicating training. ZeroSearch instead uses curriculum-based
rollouts, gradually degrading document quality to teach the
model to discern useful information.
It cuts API costs by up to 88%, and its performance surpasses training
with real search engines.
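A rough sketch of how I read the curriculum-rollout idea: the simulation LLM plays the search engine, and the share of deliberately degraded documents ramps up over training, so the policy first learns from clean retrievals and is then forced to discern. Everything here (function names, the degradation rule) is illustrative, not the paper's actual code:

```python
import random

def degrade(doc: str) -> str:
    # Illustrative corruption: randomly drop words to mimic a noisy document.
    return " ".join(w for w in doc.split() if random.random() > 0.5)

def simulate_search(query: str, llm_generate, step: int, total_steps: int) -> str:
    """The LLM itself acts as the search engine during RL training."""
    doc = llm_generate(f"Write a passage that answers: {query}")
    noise_ratio = step / total_steps        # curriculum: ramps from 0 toward 1
    return degrade(doc) if random.random() < noise_ratio else doc
```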
AI Global, Global Sector Trends on Generative AI [Link]
Gen AI Traffic Share update - Similarweb @Twitter [Link]
Subdomains and pages only (below).
[Figure: Gen AI traffic share]
YouTube and Podcasts
Fed Hesitates on Tariffs, The New Mag 7, Death of VC,
Google's Value in a Post-Search World - All-In Podcast [Link]
The Physical Turing Test: Jim Fan on Nvidia's Roadmap for
Embodied AI - Sequoia Capital [Link]
This lecture introduces the Physical Turing Test, a new
benchmark for robotics. Jim Fan from NVIDIA breaks down why solving this
is hard—and what tools researchers are using to make progress.
5 Types of AI Agents: Autonomous Functions & Real-World
Applications - IBM Technology [Link]
This lecture covers reflex agents, model-based agents, goal-based
systems, utility-based frameworks, and learning agents.
Stanford Webinar - Agentic AI: A Progression of Language
Model Usage - Stanford Online [Link]
How to connect AI agents to third-party tools using MCP -
Underfitted [Link]
Llamacon 2025 - Conversation with Mark Zuckerberg and Satya
Nadella - Meta Developers [Link]
AI Mode finally - a smart move to embrace next-gen search. Android XR
glasses are launching, and Gentle Monster and Warby Parker will be the
first eyewear partners. The Gemini app's Agent Mode is coming. And many
more!
NVIDIA CEO Jensen Huang Keynote at COMPUTEX 2025 -
NVIDIA [Link]
NVLink Fusion, DGX Spark AI computer, DGX Station supercomputer, RTX
Pro Server, AI robotics, etc.
I'll tell you my hiring experience. We have about 30 people at
8090 and the way that I have found it to work the best is you have
senior people act as mentors and then you have an overwhelming corpus of
young very talented people who are AI native. And if you don't find that
mix, what you have instead are L7s from Google and Amazon and Meta who
come to you with extremely high salary demands and stock demands and
they just don't thrive. And part of why they don't thrive is that they
push back on the tools and how you use them. They push back on all these
things, even though the tools help you get there faster. This is why I think
it's so important for the young folks to just jump in with two feet and
be AI native from the jump, because frankly you're much more hirable to
the emergent company; at the bigger companies you'll have a lot
of folks that see the writing on the wall but may not want to adapt as
fast. Another way, for example, that you can measure this is
if you look inside your company on the productivity lift of some of
these coding assistants for people as a distribution of age. What you'll
see is the younger people leverage it way more and have way more
productivity than older folks. And I'm not saying that as an ageist
comment. I'm saying that it's an actual reflection of how people are
reacting to these tools. What you're describing is a paradigm shift. It
is a big leap. You know, it's like when I went to college, when I took
computer science, it was object-oriented programming. It was like C++.
It was compiled languages. It was gnarly. It was nasty work. And then
you had these highly abstracted languages. And I remember at
Facebook, I would just get so annoyed because I was like, why is
everybody using PHP and Python? This is like not even real. But I was
one of these old Luddites who didn't understand that I just had to take
the leap. And what it did was it grew the top of the funnel of the
number of developers by 10x. And as a result, what you had were all of
these advancements for the internet. And I think what's happening right
now is akin to the same thing where you're going to grow the number of
developers upstream by 10x. But in order to embrace that, you just have
to jump in with two feet. And if you're very rigid in how you think the
job should be done technically, I think you're just going to get left
behind. - Chamath Palihapitiya
― AI Doom vs Boom, EA Cult Returns, BBB Upside, US Steel and
Golden Votes - All-In Podcast [Link]
A Parquet file is composed of Row Groups, Column Chunks, and
Pages.
Parquet is a self-describing file format that carries all the
information needed by the application that consumes it. This
allows software to efficiently understand and process the file
without requiring external information; the metadata is thus the
crucial part of Parquet. It includes the Magic Number, the FileMetadata,
and the PageHeaders.
Google Dremel (the query engine behind BigQuery) inspired
Parquet’s approach to implementing nested and repeated field storage. In
a 2010
paper introducing Dremel, Google detailed its method for efficiently
handling nested and repeated fields in analytics workloads using
definition level (for nested fields) and repetition level (for
array-like fields). I wrote an article about this approach seven months
ago; you can read it here:
― I spent 8 hours learning Parquet. Here’s what I discovered
- Vu Trinh [Link]
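You can poke at this structure directly: pyarrow parses the footer metadata (the FileMetadata) without reading the data pages. A small sketch, with the file path as the only assumption:

```python
import pyarrow.parquet as pq

pf = pq.ParquetFile("example.parquet")   # any local Parquet file
meta = pf.metadata                        # FileMetadata parsed from the footer

print(f"row groups: {meta.num_row_groups}, columns: {meta.num_columns}")
for rg_idx in range(meta.num_row_groups):
    rg = meta.row_group(rg_idx)
    for col_idx in range(meta.num_columns):
        chunk = rg.column(col_idx)        # one column chunk (holds the pages)
        print(rg_idx, chunk.path_in_schema, chunk.num_values, chunk.compression)
```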
The overall BigQuery architecture includes independent components
for query execution, storage, a container management system, and a
shuffler service:
Colossus: A distributed storage system that
holds and stores data.
Dremel: The distributed query engine.
Borg: Google's large-scale cluster management
system that can reliably manage and orchestrate compute resources. (Borg
is the predecessor of Kubernetes.) We will return to Borg when
discussing the Vortex architecture.
Dedicated shuffle service: Dremel was inspired
by the map-reduce paradigm to operate and manage the data shuffle
between stages efficiently; Google built a separate shuffle service on
top of disaggregated distributed memory. This service backs BigQuery and
supports other services, such as Google
Dataflow.
― I spent 4 hours learning the architecture of BigQuery's
storage engine - Vu Trinh [Link]
Extract: The process’s first step is
extraction. The needed data is gathered from various sources, such as
relational databases or third-party APIs.
Transform: Extracted data undergoes many
potential transformations, including cleaning, filtering, combining from
different sources, and formatting to conform to a target
schema.
Load: The transformed data is loaded into the
destination with the predefined schema and constraints.
ELT solves many of the problems associated with ETL.
Most transformation logic can now be handled within the data
warehouse using SQL, making it more accessible for users such as data
analysts or data scientists. This eliminates the potential performance
bottleneck of ETL pipelines.
Most importantly, ELT allows you to keep raw data in the
warehouse. This approach offers several advantages. You don’t need to
plan transformation logic in advance; instead, the logic can evolve over
time based on analytical needs—an especially valuable benefit in today’s
agile software development environment.
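As a toy illustration of that last point, here is the ELT shape with DuckDB standing in for the warehouse; the file name and columns are made up. The raw table lands untouched, and the transformation is just SQL that can evolve later:

```python
import duckdb

con = duckdb.connect("warehouse.duckdb")

# Load: land the raw file as-is; no transformation logic decided upfront.
con.execute("CREATE OR REPLACE TABLE raw_orders AS SELECT * FROM 'orders.csv'")

# Transform: later, in-warehouse, in SQL that analysts can iterate on.
con.execute("""
    CREATE OR REPLACE VIEW clean_orders AS
    SELECT order_id, lower(email) AS email, amount
    FROM raw_orders
    WHERE amount > 0
""")
```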
Salesforce & AI Strategy - Generative Value [Link]
This article discusses the history of Salesforce, what made it
successful, the state of the business, and the AI opportunity (or
threat) today.
Everything Wrong with MCP - Shrivu's Substack [Link]
How to future-proof your career in the age of AI - Operator's
Handbook [Link]
Key Takeaways:
The author’s call to "lean into human strengths while actively
engaging with AI" is a compelling middle path. The essay underscores
that the future belongs to those who combine AI literacy with
irreplaceable human skills—judgment, influence, and adaptability.
Human Competitive Advantages:
Judgment & Conviction: Ability to make
decisions with incomplete/ambiguous data. Distinguishing impactful work
from "interesting but useless" projects. Simplifying complexity into
actionable frameworks.
Influence & Execution: Navigating
organizational politics and incentives. Building trust and adoption for
AI-driven outputs. Understanding unspoken processes and
relationships.
Actionable Skills to Cultivate:
Develop "taste" by studying excellence in your field.
Gain hands-on experience to pressure-test AI outputs.
Learn to align stakeholders and drive consensus.
Build strong interpersonal relationships and reputation.
Adaptability as the Ultimate Skill:
AI will keep evolving, so continuous learning and flexibility are
critical.
Focus on areas where humans add unique value (judgment, influence,
creativity).
This is a very interesting point: "Develop "taste" by studying
excellence in your field."
Just like any skill, taste sharpens with exposure and effort. The
more you study, critique, and create, the better you’ll get at
recognizing—and producing—excellence. In a world flooded with
AI-generated content, the people who thrive will be those who can
separate the remarkable from the mediocre.
Blogs and Articles
How Airbnb Standardized Metric Computation at Scale - Airbnb
Blog [Link]
Good tips and tricks for digital hygiene, given the pervasive nature
of internet fraud and the data collection practices of major tech
companies.
Measuring AI Ability to Complete Long Tasks - METR
[Link]
[Figure: METR, length of tasks AI can complete (log scale)]
The "think" tool: Enabling Claude to stop and think in
complex tool use situations - Anthropic [Link]
Anthropic introduces a "think" tool designed to enhance Claude's
complex problem-solving by providing a dedicated space for structured
reasoning during tasks. This tool differs from extended thinking by
allowing Claude to pause and consider necessary information
mid-response, particularly beneficial for multi-step processes and tool
use. Evaluations on benchmarks like τ-Bench demonstrated significant
performance improvements, especially in policy-heavy domains like
airline customer service, where optimized prompting alongside the
"think" tool proved most effective.
Tiny Agents: a MCP-powered agent in 50 lines of code -
HuggingFace [Link]
Anthropic CEO wants to open the black box of AI models by
2027 - Techcrunch [Link]
Powerful AI will shape humanity’s destiny, and we deserve to
understand our own creations before they radically transform our
economy, our lives, and our future.
― The Urgency of Interpretability - Dario Amodei [Link]
Interpretability isn’t just academic—it’s a prerequisite for safe,
controllable AI. The window to solve it is narrowing as AI grows more
powerful. By steering resources toward this goal now, we might avoid a
future where humanity builds systems it doesn’t understand but can’t
afford to stop.
The Jobs That Will Fall First As AI Takes Over The Workplace
- Forbes [Link]
Takeaways:
Timeline for Disruption:
By 2030: 30% of U.S. jobs could be automated (McKinsey).
By 2035: White-collar restructuring in finance, legal, and media
(Larry Fink, Jamie Dimon).
By 2045: 50% of jobs may be fully automated (Goldman Sachs).
By 2050: AI could dominate 60-80% of jobs, depending on innovation
pace.
Most Vulnerable Jobs (Near-Term):
Administrative: Data entry, scheduling, customer service (60%
automatable, per IPPR).
Finance & Legal: Bookkeeping, contract drafting, paralegal work
(AI tools like Harvey already achieve 90% accuracy).
Creative & Media: Basic graphic design, copywriting, journalism
(30% at risk by 2035, Pew Research).
Routine STEM Tasks: Coding, data analysis (40% automatable by 2040,
WEF).
More Resilient Jobs (Longer-Term):
Healthcare: Nursing, therapy, and patient care (empathy-driven
roles).
Skilled Trades: Construction, repair, maintenance (physical labor is
harder to automate).
Focus on critical thinking, creativity, and AI collaboration (e.g.,
prompt engineering, AI-augmented decision-making).
Target Resilient Sectors - Healthcare, education, skilled trades, and
AI-adjacent roles (e.g., cybersecurity, AI ethics).
Push for employer or government-sponsored programs to transition
into hybrid (human + AI) roles.
Embrace Hybrid Roles - Jobs that combine technical skills with human
judgment (e.g., AI-assisted healthcare diagnostics) will thrive.
As Ray Dalio warns, the economy faces a "great deleveraging" where AI
disrupts jobs faster than new ones emerge. The key is
adaptability—those who proactively reinvent their
skills today will shape the workforce of tomorrow.
Curation is the new leadership superpower. Here are 3 ways to
adopt a curation mindset - FastCompany [Link]
The most transformative leaders of the next decade will be those who
master the art of curation—seeing their role as a conduit for the best
ideas, not the source of them.
The Obsolescence of the "Omniscient Leader": The pace of change,
hyper-specialization, and interconnected challenges (e.g., AI, climate,
global markets) make it impossible for one person to have all the
answers. Leaders must shift from being "the smartest in the room" to
becoming "architects of collective intelligence."
Curation as the Core Leadership Skill:
Curating Talent: Prioritize cognitive diversity over homogeneity.
Example: Diverse teams solve problems faster (39% efficiency
boost).
Curating Ideas: Create systems where unconventional thinking
flourishes (e.g., Google’s 20% time → Gmail, Maps). Actively seek
"outliers" (contrarians, outsiders) to challenge groupthink.
Curating Innovation: Design for "structured serendipity" (e.g.,
Pixar’s open office, IDEO’s cross-industry brainstorming). Embrace
cross-disciplinary collisions (e.g., NASA’s tech inspiring sportswear,
biomimicry in architecture).
How to Cultivate a Curation Mindset:
Facilitate, don’t dictate: Ask better questions; let solutions
emerge from debate (e.g., Amazon’s "Disagree and Commit").
Optimize for collaboration, not just efficiency: Space matters
(physical or virtual).
Perplexity CEO says its browser will track everything users
do online to sell ‘hyper personalized’ ads - TechCrunch [Link]
Perplexity is building a browser (Comet) to track user behavior
across the web—explicitly to fuel targeted advertising. It highlights
the company’s ambition to emulate Google’s surveillance-capitalism
playbook.
Perplexity’s move confirms that the AI search revolution is less
about displacing Google’s model than replicating it—with AI as a smarter
wrapper for the same ads.
Today’s Most Crucial Leadership Skill Is Systems Thinking -
Forbes [Link]
Leaders who master systems thinking don’t just survive
uncertainty—they thrive in it, turning complexity into
competitive advantage.
Five Key Tools of Systems Thinking for Strategic
Leaders
Problem Statements: Move from surface-level fixes to systemic
solutions. Example: Instead of asking, “How do we get customers to
recycle?”, ask, “How can we redesign products and
infrastructure for circularity?”
Stakeholder Mapping: Identify all affected parties—not just obvious
ones. Example: For electric vehicles, consider miners of critical
minerals, urban planners, and regulators, not just automakers and
buyers.
Iceberg Analysis: Look beneath visible events to uncover hidden
structures and mindsets. Example: Employee burnout isn’t just about
workload—it’s shaped by corporate culture, incentive systems, and
societal norms.
Causal Loops: Visualize feedback loops to see how actions create
ripple effects. Example: A cost-cutting measure in one department may
increase inefficiencies elsewhere.
Iteration & Testing: Embrace adaptive strategies, not rigid
plans. Example: Pilot small-scale solutions, measure impact, and refine
before full rollout.
Perplexity CEO shares the Elon Musk–inspired mantra that
helped him build the $9 billion rival to OpenAI - Fortune [Link]
Srinivas’s journey highlights resilience, speed, and Silicon Valley’s
tight-knit founder network as key drivers of startup success.
"It’s Only Over When You Give Up" – Aravind Srinivas, CEO of AI
search startup Perplexity, draws inspiration from Elon Musk’s
perseverance during SpaceX’s early failures. He told Harvard students
that success comes from relentless self-belief, even when others doubt
you.
Rocketing Valuation – Perplexity, competing with Google and OpenAI,
grew from a $1B to a $9B valuation and is now in talks to raise funds at
an $18B valuation.
Forget Pitch Decks, Build Fast – Srinivas advises founders to focus
on rapid product iteration rather than lengthy business plans. He admits
he doesn’t even know how to make a pitch deck—Perplexity’s success came
from live demos.
OpenAI Alumni Network – Despite competing with OpenAI, Srinivas
maintains a strong relationship with Sam Altman (his former boss at
OpenAI). This mirrors the "PayPal Mafia" dynamic, where ex-OpenAI
employees now lead major AI firms like Anthropic and Safe
Superintelligence.
Marc Andreessen predicts one of the few jobs that may survive
the rise of AI automation - Fortune [Link]
Andreessen’s logic suggests focusing on roles where trust,
psychology, and networks matter more than data crunching. But don’t
underestimate AI’s ability to creep into those domains too.
How To Get Noticed Without Self-Promotion By Using Strategic
Visibility - Forbes [Link]
Core Lessons:
Hard Work ≠ Visibility: Doing great work is
necessary but insufficient. If leaders don’t know what you’re doing,
they can’t reward it. Waiting for annual reviews is too late—visibility
requires consistent, intentional updates.
Humility Has a Hidden Cost: While modesty is
admirable, staying silent can render you invisible. Gallup’s data on
declining engagement (just 36% in 2020) highlights how disengagement
hurts promotion prospects. Visibility isn’t ego-driven; it’s about
ensuring your impact is recognized.
Visibility ≠ Bragging: Framing contributions as
useful knowledge (e.g., "Here’s how I solved X") builds trust and
leadership credibility. Sharing wins, failures, and best practices helps
the team and positions you as a problem-solver.
Tactical Ways to Increase Visibility
Share knowledge: Lead "lessons learned" sessions or contribute to
internal newsletters.
Mentor others: Their success reflects your leadership.
Speak up strategically: One substantive insight per meeting >
empty chatter.
Volunteer for high-impact projects: Align with organizational
priorities.
Write internally: Document best practices to showcase thought
leadership.
Emotional Intelligence (EQ) Matters More Than
Extroversion
Visibility is about meaningful engagement, not being the
loudest.
Avoid self-deprecating language ("I’m sorry, but…")—speak with
conviction.
What Leaders Actually Notice
Initiative, influence, and alignment with goals matter more than
face-time.
Working smart (not just late) and collaborating effectively signal
leadership potential.
YouTube and Podcasts
DOGE updates + Liberation Day Tariff Reactions with Ben
Shapiro and Antonio Gracias - All-In Podcast [Link]
2027 Intelligence Explosion: Month-by-Month Model — Scott
Alexander & Daniel Kokotajlo - Dwarkesh Patel [Link]
Trump vs Harvard, Nvidia export controls, how DEI killed
Hollywood with Tim Dillon - All-In Podcast [Link]
How DeepSeek Rewrote the Transformer [MLA] - Welch
Labs [Link]
A lecture explaining the architecture and optimizations behind
DeepSeek R1, a language model that improves Transformer efficiency.
Live Demo: Reinforcement Fine-Tuning for LLMs — Build Smarter
Models with Less Data l Tutorial - Predibase [Link]
This video covers why RFT beats supervised fine-tuning (SFT) on
reasoning tasks, gives a live demo of an end-to-end RFT workflow, and
walks through a PyTorch-to-Triton case study showing real-world
impact.
Model Context Protocol (MCP), clearly explained (why it
matters) - Greg Isenberg [Link]
Trump Rally or Bessent Put? Elon Back at Tesla, Google's
Gemini Problem, China's Thorium Discovery - All-In Podcast [Link]
Suffering is mostly mental anguish and mental pain and it just
means you don't want to do the task at hand.
The kind of fame that pure actors and celebrities have, I
wouldn't want, but the kind of fame that's earned because you did
something useful, why dodge that.
People will always want more status, but I think you can be
satisfied at a certain level of wealth.
Not the kind of confidence that would say I have the answer but
the kind of confidence that I will figure it out and I know what I want
or only I am a good arbiter of what I want.
Pride is the enemy of learning, so when I look at my friends and
colleagues, the ones who are still stuck in the past and have grown the
least are the ones who were the proudest, because they sort of feel like
they already had the answers and so they don't want to correct
themselves publicly.
I think everybody puts themselves first; that's just human nature.
You're here because you survived; you're a separate organism.
The happier you are, the more you can sustain doing something,
the more likely you're going to do something that will in turn make you
even happier, and you'll continue to do it, and you'll outwork everybody
else. The more free you are the better you can allocate your
time.
There are no problems in the real world other than maybe things
that inflict pain on your body. Everything else has to become a problem
in your mind first.
Your family is broken but you're going to fix the world. People
are running out there to try and fix the world when their own lives are
a mess.
I think the only true test of intelligence is if you get what you
want out of life, and there are two parts to that: one is getting what
you want, so you know how to get it, and the second is wanting the right
things, knowing what to want in the first place.
Usually I think people end up there because they are going on
autopilot with sort of societal expectations or other people's
expectations or out of guilt or out of like mimetic desire.
Probably the biggest regret will be staying in the relationship
after you knew it was over. You should have left sooner; the
moment you knew it wasn't going to work out, you should have moved
on.
We are naturally hardwired to be pessimists, but modern society is
very different: despite whatever problems you may have with it, it is
far, far safer than living in the jungle just trying to survive, and
the opportunities are far greater.
Leave all those labels alone. It's better just to look at the
problem at hand, look at reality the way it is, try to take yourself out
of the equation in a sense.
The less you think about yourself the more you can think about a
mission or about God or about a child or something like that.
I don't think there are any formulas. I think it's unique to each
person. It's like asking a successful person how they became
successful: each one will give you a different story. You can't
follow anyone else's path.
A lot of change is more about desire and understanding than it is
about forcing yourself or trying to domesticate yourself.
When your mind is under stress, it's because it has two
conflicting desires at once... and anxiety I think is sort of this
pervasive unidentifiable stress where you're just kind of stressed out
all the time and you're not even sure why and you can't even identify
the underlying problem. I think the reason for that is because you have
so many unresolved problems unresolved stress points that have piled up
in your life that you can no longer identify what the problems
are.
Life is going to play out the way it's going to play out; there
will be some good and some bad, and most of it is actually just up to
your interpretation.
The gut is what decides; the head is kind of what rationalizes it
afterwards. The gut is the ultimate decision maker.
You can't change other people, you can change your reaction to
them.
If you do want to change someone's behavior, I think the only
effective way to do it is to compliment them when they do something you
want, not to insult them or be negative or critical when they do
something you don't want.
If you can't decide, the answer is no.
Almost invariably the advice that you would give yourself 10
years ago is still the advice that you need to hear today.
On mental things, I think understanding is way more important:
once you see the truth of something, you cannot unsee it... when we
really do see something clearly, it changes our behavior immediately,
and that is far more efficient than trying to change your behavior
through repetition.
Truth is often painful; if it wasn't, we'd all be seeing truth
all the time. Reality is always reflecting truth, that's all it is, so
why would you not have accessed it already... wisdom is the set of
things that cannot be transmitted. If they could be transmitted, you
know, we'd read the same five philosophy books, and we'd all be done,
we'd all be wise. You have to learn it for yourself; it has to be
rediscovered for yourself in your own context.
You're probably better off only caring about things that are
local or things that you can affect. So if you really care about
something that's in the news, then by all means care about it, but make
a difference: go do something about it.
Desire is a contract to be unhappy until you get what you
want.
The real currency of life is attention: it's what you choose to
pay attention to, and what you do about it.
― 44 Harsh Truths About Human Nature - Naval Ravikant (4K) -
Chris Williamson [Link]
Key Learnings:
Someone who can do the job peacefully or happily is more effective
than someone with unnecessary emotional turmoil.
Fame sought for its own sake is fragile and leads to a constant need
to perform.
People often say things they don't really believe, driven by a
desire to be seen as something they are not.
Status is zero-sum and insatiable, unlike wealth. Status is often
comparative, like leaderboards, where one person's gain can be another's
loss.
Self-esteem comes from aligning actions with internal values,
especially when difficult. Genuine sacrifice, doing something you want
less for something you value more, can build self-esteem.
True confidence is not having all the answers but the self-belief to
figure things out.
Pride is an enemy of learning and can lead to being stuck in past
mistakes.
Everyone puts themselves first; unapologetic self-prioritization is
rare but perhaps more honest. Much of what appears as altruism might be
a waste of time if it goes against one's true desires.
Happiness and freedom are intertwined with efficiency and
productivity.
Many emotional problems arise from the mind creating problems where
none exist in the real world. He advises observing one's thoughts
objectively to realize unnecessary emotional energy expenditure.
People often try to fix the world while their own lives are in
disarray. He questions the credibility of those who cannot manage their
own lives but seek to solve global issues.
True intelligence is getting what you want out of life by wanting
the right things and knowing how to get them.
Many people go through life unconsciously following societal or
mimetic desires. He emphasizes the importance of thinking things through
for oneself rather than blindly following others.
Staying too long in bad situations (relationships, jobs) is a common
regret.
We are naturally hardwired for pessimism due to evolutionary
pressures to avoid ruin.
Humans are dynamic and labels like optimist, pessimist, introvert,
extrovert are self-limiting.
Overthinking about oneself can lead to misery; focusing on something
bigger can bring happiness. Overthinking and rumination do not help with
happiness.
There are no universal formulas for success or happiness; each
person's path is unique.
Lasting change comes from desire and understanding, not forcing
oneself. He suggests aligning actions with genuine wants for maximal
effectiveness.
Anxiety often stems from having many unresolved and conflicting
desires.
Our interpretations of experiences shape our reality. The same
experience can lead to different emotional responses based on individual
interpretation.
The "gut" is the ultimate decision-maker, representing refined
judgment accumulated through evolution and experience. He advises
trusting this instinct once it's developed.
You cannot change other people, only your reaction to them. He adds
that people change through their own insights or trauma, not by being
told to.
Negative reinforcement is less effective than positive reinforcement
in changing behavior.
If faced with a difficult choice and unable to decide, the answer is
often "no." He also suggests that when choosing between two equal
options, take the more painful path in the short term.
Understanding is more important than discipline for mental
change.
Truth, though often painful, is constantly reflected by reality;
wisdom is the personal rediscovery and contextual application of
timeless truths. He also mentions that many important life lessons are
"unteachable" in the sense that they must be experienced firsthand to be
truly understood.
Memorization is becoming less valuable in the age of readily
available information; understanding, judgment, and taste are more
crucial. He links understanding to solving real problems and finding
generalizable truths.
Philosophy evolves with new knowledge and perspectives. He explains
how advancements in science and technology lead to different
philosophical outlooks, and even moral philosophy progresses over
time.
Many philosophical paradoxes can be resolved by considering
different scales and timeframes. Naval suggests that seemingly
contradictory questions like free will and determinism can be understood
by shifting perspectives.
Coordination is essential for societal function; pure libertarianism
is unsustainable.
Modern AI, while powerful, currently lacks true creativity and deep
understanding.
Meaning can be more important than moment-to-moment happiness.
In an age of news saturation, it's a battle to maintain focus on
what truly matters and what one can influence. He emphasizes that
attention is the real currency of life and should be spent consciously.
Attention, not time or money, is the most fundamental resource in
life.
Getting past one's past is a skill achieved by processing it to be
rid of it, not to dwell on it.
I think agents are real, but I think that we are far away from
that because we're still at the phase of how do you build reliable
software in production for an enterprise versus the toy apps that you
see on the internet which is like let me vibe code something. I think
these things are worlds apart still. - Chamath Palihapitiya
I think we have not yet figured out how to move the budgets from
experimentation to mainline production. Meaning where large chunks of
the US economy are comfortable enough with the ways in which
hallucinations are managed such that they will replace legacy
deterministic code with this new probabilistic model generated code
meaning model enabled code. - Chamath Palihapitiya
― Trump's First 100 Days, Tariffs Impact Trade, AI Agents,
Amazon Backs Down - All-In Podcast [Link]
Papers and Reports
Orchestrating Agents and Data for Enterprise: A Blueprint
Architecture for Compound AI [Link]
This paper contributes to the enterprise AI landscape by offering a
comprehensive architectural blueprint for deploying agentic, modular,
and data-integrated AI systems that can efficiently leverage LLMs and
enterprise assets.
GitHub
Google Gemini 2.0 with MCP (Model Context Protocol) Servers -
Gemini Samples [Link]
Maestro - A Framework for Claude Opus, GPT and local LLMs to
Orchestrate Subagents - maestro [Link]
Accelerate Generalist Humanoid Robot Development with NVIDIA
Isaac GR00T N1 - NVIDIA [Link]
Announcing the Agent2Agent Protocol (A2A) - Google for
Developers [Link]
Key Takeaways:
A2A is an open-source protocol backed by 50+ tech giants (e.g.,
Salesforce, SAP, Cohere) and consultancies (e.g., Accenture, Deloitte).
It allows agents from different vendors/frameworks to communicate, share
data, and coordinate tasks without being locked into a single
platform.
Solving Enterprise Pain Points: Breaks down silos by letting agents
interoperate across HR (Workday), CRM (Salesforce), ERP (SAP), and other
systems. Example: A hiring manager’s agent can autonomously source
candidates, schedule interviews, and run background checks by
collaborating with specialized agents.
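Discovery in A2A starts with an "agent card": a small JSON document each agent publishes (conventionally at /.well-known/agent.json) describing what it can do, so other agents can find and call it. The sketch below is illustrative of the general shape, trimmed rather than spec-exact:

```python
# Illustrative A2A agent card, as it might be served at /.well-known/agent.json.
agent_card = {
    "name": "candidate-sourcing-agent",
    "description": "Finds and screens job candidates.",
    "url": "https://agents.example.com/sourcing",   # hypothetical endpoint
    "version": "1.0.0",
    "capabilities": {"streaming": True},
    "skills": [
        {
            "id": "source_candidates",
            "name": "Source candidates",
            "description": "Search for candidates matching a job description.",
        }
    ],
}
```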
How to Build a Graph RAG App - Steve Hedden [Link]
A walkthrough of building a graph RAG app that improves LLM accuracy
using knowledge graphs. It covers data preparation, search refinement
with MeSH terms, and article summarization.
We believe that, in 2025, we may see the first AI agents “join
the workforce” and materially change the output of companies. We
continue to believe that iteratively putting great tools in the hands of
people leads to great, broadly-distributed outcomes.
This guide provides a comprehensive exploration of AI-powered agents,
focusing on their capabilities, planning, tool selection, and failure
modes. It delves into the factors determining an agent's performance,
how LLMs can plan, and how to augment planning capabilities. It also
provides insights into agent failures and how to evaluate them
effectively.
The Batch Issue 284 - DeepLearning.AI - Andrew Ng
[Link]
Andrew Ng highlights AI Product Management’s growth as software
becomes cheaper to build.
Global-batch load balance almost free lunch to improve your
MoE LLM training - Qwen [Link]
MoE models struggle with expert underutilization due to
micro-batch-level load balancing, which fails when data within a batch
lacks diversity. This results in poor expert specialization and model
performance.
The paper proposes global-batch load balancing, where expert
selection frequencies are synchronized across all parallel groups,
ensuring more effective domain specialization and improved
performance.
Global-batch load balancing outperforms micro-batch balancing in all
tested configurations. It shows improved performance and expert
specialization, with models achieving better results across various data
sizes and domains.
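As a rough sketch of the difference, here is a standard Switch-style auxiliary balancing loss with the expert-selection statistics averaged across data-parallel ranks before the loss is computed; names and details are illustrative, not Qwen's implementation.

```python
# Load-balancing loss computed from *global-batch* statistics: selection
# frequencies are averaged across ranks before the penalty is applied.
import torch
import torch.distributed as dist

def load_balance_loss(router_probs, expert_mask, global_batch=True):
    # router_probs: [tokens, n_experts] softmax outputs of the router
    # expert_mask:  [tokens, n_experts] one-hot top-k expert selections
    freq = expert_mask.float().mean(dim=0)  # fraction of tokens per expert
    prob = router_probs.mean(dim=0)         # mean router probability per expert
    if global_batch and dist.is_initialized():
        # Synchronize statistics so balance is enforced over the global batch,
        # not each (possibly domain-homogeneous) micro-batch.
        dist.all_reduce(freq, op=dist.ReduceOp.SUM)
        dist.all_reduce(prob, op=dist.ReduceOp.SUM)
        freq /= dist.get_world_size()
        prob /= dist.get_world_size()
    n_experts = router_probs.shape[-1]
    return n_experts * torch.sum(freq * prob)  # Switch-style auxiliary loss
```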
How to Evaluate LLM Summarization - Isaac Tham [Link]
A quantitative, research-backed framework for evaluating LLM
summaries, focusing on conciseness and coherence. This guide explores
challenges in summarization evaluation, defines key quality metrics
(conciseness, coherence), and improves the Summarization Metric in the
DeepEval framework. Includes a GitHub notebook for applying these
methods to assess summaries of long-form content systematically.
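A minimal usage sketch of the DeepEval metric, assuming its documented interface at the time of writing; the texts and threshold are placeholders.

```python
# Minimal DeepEval check of a summary (interface per DeepEval's docs at the
# time of writing; texts and threshold are placeholders).
from deepeval.metrics import SummarizationMetric
from deepeval.test_case import LLMTestCase

source_text = "..."  # the long-form document
summary = "..."      # the LLM-generated summary to score

test_case = LLMTestCase(input=source_text, actual_output=summary)
metric = SummarizationMetric(threshold=0.5)  # pass/fail cutoff on the score
metric.measure(test_case)
print(metric.score, metric.reason)
```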
We just gave sight to smolagents - HuggingFace [Link]
This tutorial shows how to integrate vision capabilities into
autonomous agents using smolagents. It explains passing images to agents
in two ways: at initialization or dynamically via callbacks. It
demonstrates building a web-browsing agent with vision using the
MultiStepAgent class and helium. The agent performs actions like
navigation, popup handling, and dynamic webpage analysis.
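From memory of the post's approach, the dynamic route looks roughly like the sketch below: a per-step callback screenshots the helium-driven browser and attaches the image to the step's observations. Treat the class, field, and parameter names as assumptions and check the smolagents docs for the exact interface.

```python
# Rough sketch of the dynamic-image route (names are recollections of the
# tutorial, not verified): a step callback screenshots the browser so the
# vision model sees the current page at every step.
from io import BytesIO
from PIL import Image
import helium
from smolagents import CodeAgent, OpenAIServerModel  # class names may vary by version

def save_screenshot(memory_step, agent):
    png = helium.get_driver().get_screenshot_as_png()             # current page as PNG
    memory_step.observations_images = [Image.open(BytesIO(png))]  # assumed field

model = OpenAIServerModel(model_id="gpt-4o")  # any vision-capable model object
agent = CodeAgent(tools=[], model=model, step_callbacks=[save_screenshot])
agent.run("Go to example.com and describe the page.")
```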
On DeepSeek and Export Controls - Dario Amodei [Link]
Highlighting export controls' impact on AI geopolitics.
Hugging Face challenges OpenAI’s Deep Research with an open-source
alternative, beating previous SOTA by 9 points.
Choosing the Right AI Agent Framework: LangGraph vs CrewAI vs
OpenAI Swarm - Yi Zhang [Link]
Compare LangGraph, CrewAI, and OpenAI Swarm frameworks for building
agentic applications with hands-on examples. Understand when to use each
framework, and get a preview of debugging and observability topics in
Part II.
Learn how to scale LLMs on TPUs by understanding hardware
limitations, parallelism, and efficient training techniques. Explore how
to estimate training costs, memory needs, and optimize performance using
strategies like data, tensor, pipeline, and expert parallelism. Gain
hands-on experience with LLaMA-3, and learn to profile and debug your
code.
Sam outlines AI trends: AI’s scaling limits, cost reduction, and the
future of autonomous agents.
How to deploy and fine-tune DeepSeek models on AWS -
HuggingFace [Link]
Deploy and fine-tune DeepSeek-R1 models on AWS using Hugging Face
with GPUs, SageMaker, and EC2 Neuron.
Building a Universal Assistant to connect with any API -
Pranav Dhoolia [Link]
Convert any OpenAPI spec into an MCP-compatible API assistant without
writing custom integration code. Use a generic MCP server to expose API
endpoints dynamically. This approach simplifies integration, expands
compatibility, and makes scaling API support more efficient.
From PDFs to Insights: Structured Outputs from PDFs with
Gemini 2.0 - Philschmid [Link]
Learn to convert PDFs into structured JSON using Gemini 2.0. Set up
the SDK, process files, manage tokens, and define JSON schemas with
Pydantic. It covers real-world examples like invoices and forms, best
practices, and cost management, and it works within the free tier.
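A minimal sketch of that flow, assuming the google-genai Python SDK; the schema and file name are illustrative, not the article's.

```python
# Minimal sketch with the google-genai SDK (keyword names may vary by SDK
# version); a Pydantic model supplies the JSON schema for structured output.
from pydantic import BaseModel
from google import genai

class Invoice(BaseModel):
    invoice_number: str
    total_amount: float
    currency: str

client = genai.Client(api_key="YOUR_API_KEY")
pdf = client.files.upload(file="invoice.pdf")  # upload once, reuse across calls
resp = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[pdf, "Extract the invoice fields."],
    config={
        "response_mime_type": "application/json",
        "response_schema": Invoice,
    },
)
print(resp.parsed)  # an Invoice instance
```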
The Hidden Ways We Really Work Together - Microsoft
[Link]
Managing LLM implementation projects - Piotr
Jurowiec [Link]
Discover how to implement LLMs from initial planning to deployment.
Establish project goals, select suitable architectures, preprocess data,
train and evaluate models, optimize hyperparameters, and incorporate
domain expertise. Tackle challenges such as hallucinations, security
risks, regulatory compliance, and scalability limitations. Develop
systematic workflows for building and managing LLM-based
applications.
How to build a ChatGPT-Powered AI tool to learn technical
things fast - AWS [Link]
What Problem Does The Model Context Protocol Solve? -
AIhero [Link]
Learn how the Model Context Protocol (MCP) simplifies integrating
large language models (LLMs) with external APIs.
MCP acts as a connector between LLMs and external data sources,
facilitating interactions with tools without requiring LLMs to
understand intricate APIs. By providing a standardized interface, it
streamlines integrations with platforms like GitHub, enhancing workflow
speed and efficiency.
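To make that concrete, here is a minimal server sketch using the official MCP Python SDK's FastMCP helper; the tool itself is a made-up example.

```python
# Minimal MCP server sketch (official `mcp` Python SDK); the tool body is
# stubbed for illustration - a real server would call the GitHub API here.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("github-helper")

@mcp.tool()
def get_repo_description(owner: str, repo: str) -> str:
    """Return a short description for a GitHub repository."""
    return f"{owner}/{repo}: description goes here"

if __name__ == "__main__":
    mcp.run()  # stdio transport, so any MCP-capable client can connect
```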
Most AI value will come from broad automation, not from
R&D - Epoch AI [Link]
Epoch AI's article argues against the popular notion that the primary
economic benefit of artificial intelligence will stem from its
application in research and development. Instead, the authors posit that
AI's most significant value will arise from its widespread deployment in
automating existing labor across various sectors.
Substack
Tencent: Betting Big on AI - App Economy Insights
[Link]
[Image: Tencent corporate overview]
Tencent's proprietary HunYuan framework has developed into a central
AI platform catering to both consumers and enterprises. Originally
centered on text and conversational AI, HunYuan has expanded to support
multimodal capabilities, including image, video, and 3D generation,
where it has attained top rankings in industry benchmarks.
[Image: HunYuan thesis]
Tencent has a Dual-Core AI strategy: It combines its proprietary T1
model with external AI, such as DeepSeek’s R1, in a “double-core”
approach. Yuanbao chatbot utilizes both—T1 for deep reasoning and R1 for
quick responses—while WeChat Search enhances accuracy by integrating T1
with DeepSeek.
Alphabet has agreed to its largest acquisition to date, a $32
billion deal to acquire cloud security startup Wiz. If completed, this
move could redefine GCP’s security portfolio, strengthening its stance
as AI-driven cloud computing becomes the focal point.
[Image: Google's biggest acquisition]
Papers and Reports
Whitepaper Agents - Authors: Julia Wiesinger, Patrick Marlow
and Vladimir Vuskovic [Link]
Google’s whitepaper explains how AI agents use reasoning, tools, and
external data to automate tasks, turning large language models (LLMs)
into workflow automation systems. Google suggests using
LangChain for prototyping and Vertex
AI for scaling production-ready agents. Its framework provides
a standardized approach to ensure reliable AI agent execution.
Key Components
Decision Engine – The LLM plans and executes tasks
using reasoning methods like ReAct or Chain-of-Thought.
Tool Integration – Agents interact with APIs,
databases, and real-time data.
Orchestration Layer – Manages task execution and
decision-making.
Tool Types
Extensions – Directly call APIs for
automation.
Functions – Allow developers to control
execution.
Data Stores – Use retrieval-augmented generation
(RAG) for external data access.
Use Cases
Agents handle tasks like personalized recommendations, workflow
automation, and database queries. For example, they can fetch a user’s
purchase history and generate tailored responses.
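A framework-free sketch of how those components fit together; every name here is illustrative (including the toy `parse_action` helper), not Google's code.

```python
# Schematic agent loop: the LLM is the decision engine, `tools` is the tool-
# integration layer, and this function is a bare-bones orchestration layer.
def parse_action(step):
    # Toy parser for a trailing "Action: tool_name[tool_input]" line.
    line = step.splitlines()[-1].removeprefix("Action: ")
    name, arg = line.split("[", 1)
    return name, arg.rstrip("]")

def agent_loop(llm, tools, task, max_steps=5):
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        step = llm("\n".join(history))  # ReAct-style "Thought ... Action ..."
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        name, arg = parse_action(step)
        observation = tools[name](arg)  # call an extension, function, or data store
        history += [step, f"Observation: {observation}"]
    return "Stopped: step budget exhausted."
```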
Introducing smolagents, a simple library to build
agents - HuggingFace [Link]
HuatuoGPT-o1, Towards Medical Complex Reasoning with
LLMs [Link]
This paper shows how to build domain-specific reasoning models using
a two-stage training process. HuatuoGPT-o1, a medical LLM, enhances
complex reasoning using this two-stage approach: (1) supervised
fine-tuning (SFT) with complex Chain-of-Thought (CoT) and (2)
reinforcement learning (RL) using a verifier to refine reasoning.
[Image: HuatuoGPT-o1]
Inference-Time Scaling for Diffusion Models beyond Scaling
Denoising Steps - Google DeepMind [Link]
Google DeepMind introduces a noise-search method that outperforms
simply scaling denoising steps in diffusion models.
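The core idea reads as a search over initial noises scored by a verifier; a toy sketch under that reading (not DeepMind's algorithm):

```python
# Toy best-of-n noise search (illustrative): generate one sample per initial
# noise seed and keep the candidate the verifier scores highest.
def best_of_n_noise(generate, verifier, n=8):
    candidates = [generate(seed=i) for i in range(n)]
    return max(candidates, key=verifier)
```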
Chain of Agents: Large language models collaborating on
long-context tasks - Google Research [Link]
[Paper]
SFT Memorizes, RL Generalizes: A Comparative Study of
Foundation Model Post-training - Google DeepMind [Link] [Link]
Explaining why reinforcement learning outperforms supervised
fine-tuning for model generalization.
Learning to Plan & Reason for Evaluation with
Thinking-LLM-as-a-Judge [Link]
A new preference optimization algorithm for LLM-as-a-Judge
models.
Hallucination Mitigation using Agentic AI Natural
Language-Based Frameworks [Link]
Generative AI models often produce hallucinations, making them less
reliable and reducing trust in AI systems. In this work, a multi-agent
system is designed using over 300 prompts to induce hallucinations. AI
agents at different levels review and refine outputs using distinct
language models, structured JSON communication, and the OVON framework
for seamless interaction. New KPIs are introduced to measure
hallucination levels.
ELEGNT: Expressive and Functional Movement Design for
Non-Anthropomorphic Robot - Apple [Link]
[Link]
This is very cool.
π0 and π0-FAST: Vision-Language-Action Models for General
Robot Control - Hugging Face [Link]
Hugging Face publishes the first open-source robotics foundation
models for real-world applications.
Claude 3.7 Sonnet introduces extended thinking, visible reasoning,
and improved agentic capabilities for complex tasks.
Can LLMs Generate Novel Research Ideas? A Large-Scale Human
Study with 100+ NLP Researchers [Link]
This study evaluated the ability of LLMs to generate novel,
expert-level research ideas compared to human experts by recruiting over
100 NLP researchers for idea generation and blind reviews. Results
showed that LLM-generated ideas were rated as more novel than human
ideas (p < 0.05) but slightly less feasible. While LLMs demonstrated
promising ideation capabilities, challenges such as limited idea
diversity and unreliable self-evaluation were identified, highlighting
areas for improvement in developing effective research agents.
Yann LeCun and his team have proposed Dynamic Tanh (DyT) as an
alternative to conventional normalization layers in deep learning
models. This innovative method, leveraging the scaled tanh function,
delivers performance on par with or superior to techniques like
LayerNorm and RMSNorm. Notably, its ability to lower computational cost
while preserving model performance makes it particularly compelling.
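The layer itself is compact; here is a PyTorch sketch of DyT as described in the paper (the initialization value is the paper's reported default, but treat details as assumptions):

```python
# Dynamic Tanh (DyT) sketch: a learnable-scale tanh plus per-channel affine,
# used as a drop-in replacement for LayerNorm/RMSNorm. No batch statistics.
import torch
import torch.nn as nn

class DyT(nn.Module):
    def __init__(self, dim, alpha_init=0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha_init))  # scalar input scale
        self.weight = nn.Parameter(torch.ones(dim))          # per-channel gain
        self.bias = nn.Parameter(torch.zeros(dim))           # per-channel shift

    def forward(self, x):
        return torch.tanh(self.alpha * x) * self.weight + self.bias
```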
Some articles mentioned: "Retrieval-Augmented Generation with
Knowledge Graphs for Customer Service Question Answering" [Link], and "GraphRAG:
Unlocking LLM discovery on narrative private data" [Link].
Building a fully local "deep researcher" with DeepSeek-R1 -
LangChain [Link]
This tutorial reviews DeepSeek R1's training methods, explains
downloading the model via Ollama, and demonstrates JSON-mode testing.
It then builds a local "deep research" assistant, which performs web
research and iterative summarization with reflection for improved results.
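A minimal JSON-mode check of the kind the tutorial describes, assuming the ollama Python client and a locally pulled R1 model; the model tag and prompt are illustrative, and the response shape may vary by client version.

```python
# Minimal JSON-mode test against a local DeepSeek-R1 via the ollama client.
import json
import ollama

resp = ollama.chat(
    model="deepseek-r1:8b",  # illustrative tag; use whichever size you pulled
    messages=[{
        "role": "user",
        "content": "Summarize: LangGraph builds stateful agent graphs. "
                   "Reply as JSON with keys 'summary' and 'keywords'.",
    }],
    format="json",  # constrain the output to valid JSON
)
print(json.loads(resp["message"]["content"]))
```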
Building Effective Agents with LangGraph - LangChain
[Link]
This video shows the difference between agents and workflows and when
to use each. You'll implement patterns like prompt chaining,
parallelization, and routing using LangGraph. The session covers
building agents, applying advanced patterns, and understanding how
LangGraph enhances automation and optimization in AI systems.
Nvidia's GTC 2025 Keynote: Everything Announced in 30 Minutes
- Amrit Talks [Link]
Satya Nadella discusses AI, AGI skepticism, economic growth, quantum
computing, AI pricing, gaming models, and legal challenges. Notes on the
insights that impressed me:
Nadella believes hyperscalers (like Microsoft Azure, AWS, and Google
Cloud) will be major beneficiaries of AI advancements. The exponential
growth in compute demand for AI workloads—both for training and
inference—will drive massive infrastructure needs. Hyperscalers are
well-positioned to meet this demand due to their ability to scale
compute, storage, and AI accelerators efficiently.
He argues that hyperscale infrastructure is not a winner-takes-all
market. Enterprises and corporations prefer multiple suppliers to avoid
dependency on a single vendor. This structural dynamic ensures
competition and prevents monopolization.
While there may be a few dominant closed-source AI models, Nadella
predicts that open-source alternatives will act as a check, preventing
any single entity from monopolizing the AI model space. He draws
parallels to the coexistence of closed-source (e.g., Windows) and
open-source systems in the past.
He highlights that governments worldwide are unlikely to allow
private companies to dominate AI entirely. Regulatory and state
involvement will likely shape the landscape, further preventing a
winner-takes-all scenario.
In consumer markets, network effects can lead to winner-takes-all
dynamics (e.g., ChatGPT's early success). However, in enterprise
markets, multiple players will thrive across different categories.
He disagrees with the notion that AI models or cloud infrastructure
will become commoditized. At scale, the complexity of managing
hyperscale infrastructure and the know-how required to optimize it
create significant barriers to entry and sustain profitability.
Microsoft aims to build a versatile hyperscale fleet capable of
handling large training jobs, inference workloads, and specialized tasks
like reinforcement learning (RL). The company focuses on distributed
computing, global data center placement, and high utilization of
resources to meet diverse AI demands.
Nadella envisions a future where AI agents and specialized models
will drive even greater compute demand. He emphasizes the importance of
building infrastructure that can support both training and inference at
scale, while also accommodating evolving AI research and
development.
Microsoft Research (MSR) has a history of investing in fundamental,
curiosity-driven research, often with no immediate payoff. Nadella
emphasizes the importance of maintaining this culture, even if the
benefits may only materialize decades later. Nadella highlights the
difficulty of transitioning from research breakthroughs to scalable
products. The role of leadership is to ensure that innovations are not
only technically sound but also commercially viable.
Nadella envisions quantum computing being accessed via APIs, similar
to how cloud services are used today. This could democratize access to
quantum capabilities for research and industry.
A good course to help you align LLMs with specific use cases. It
includes instruction tuning, preference alignment using DPO/ORPO, LoRA,
prompt tuning, and multimodal model adaptation, and it covers creating
synthetic datasets, evaluation, and efficient inference.
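For the LoRA portion, a minimal PEFT setup sketch; the model name and hyperparameters are illustrative, not the course's.

```python
# Minimal LoRA setup with Hugging Face PEFT: wrap a base model so that only
# small low-rank adapter matrices in the attention projections are trained.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
config = LoraConfig(r=8, lora_alpha=16,
                    target_modules=["q_proj", "v_proj"],
                    task_type="CAUSAL_LM")
model = get_peft_model(model, config)
model.print_trainable_parameters()  # adapters are a tiny fraction of the model
```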
This repository provides access to a selection of curated Model
Context Protocol (MCP) servers designed for seamless AI model-resource
interaction. It features both production-ready and experimental servers,
offering capabilities like file access, database connections, and API
integrations. There are frameworks, tutorials, and practical tips to
enhance model deployment and maximize resource efficiency in real-world
applications.
I just finished reading "Daring Greatly: How the Courage to Be
Vulnerable Transforms the Way We Live, Love, Parent, and Lead" by Brené
Brown. This is the second book of hers I have read. Her words feel like
whispers from God.
What she says about wholeheartedness:
“Wholehearted living is about engaging in our lives from a place of
worthiness. It means cultivating the courage, compassion, and connection
to wake up in the morning and think, No matter what gets done and how
much is left undone, I am enough. It's going to bed at night thinking,
Yes, I am imperfect and vulnerable and sometimes afraid, but that
doesn't change the truth that I am also brave and worthy of love and
belonging.”
What she says about vulnerability:
“Vulnerability is based on mutuality and requires boundaries and
trust. It's not oversharing, it's not purging, it's not indiscriminate
disclosure, and it's not celebrity-style social media information dumps.
Vulnerability is about sharing our feelings and our experiences with
people who have earned the right to hear them. Being vulnerable and open
is mutual and an integral part of the trust-building process.”
“If we're going to find our way out of shame and back to each other,
vulnerability is the path and courage is the light. To set down those
lists of what we're supposed to be is brave. To love ourselves and
support each other in the process of becoming real is perhaps the
greatest single act of daring greatly.”
What she says about perfectionism: (I love this part!)
“The problem was thankfully never fixed, and in time the box
overflowed as more and more art piled up. I think the dilemma exists
because art, among all the other tidy categories, most closely resembles
what it is like to be human. To be alive. It is our nature to be
imperfect. To have uncategorized feelings and emotions. To make or do
things that don't sometimes necessarily make sense.
Art is all just perfectly imperfect.
My fixation with these words from Leonard Cohen's song "Anthem" comes
from how much comfort and hope they give me as I put "enough" into
practice: "There's a crack in everything. That's how the light gets
in."”
What she says about oversharing:
“It's an important question, and the answer is that I don't tell
stories or share vulnerabilities with the public until I've worked
through them with the people I love. I have my own boundaries around
what I share and what I don't share and I stay mindful of my
intentions.
First, I only share stories or experiences that I've worked through
and feel that I can share from solid ground. I don't share what I define
as "intimate" stories, nor do I share stories that are fresh wounds.
Second, I follow the rule that I learned in my graduate social work
training. Sharing yourself to teach or move a process forward can be
healthy and effective, but disclosing information as a way to work
through your personal stuff is inappropriate and unethical.
Last, I only share when I have no unmet needs that I'm trying to
fill. I firmly believe that being vulnerable with a larger audience is
only a good idea if the healing is tied to the sharing, not to the
expectations I might have for the response I get.”
What she says about disengagement:
“Disengagement is the issue underlying the majority of problems I see
in families, schools, communities, and organizations and it takes many
forms, including the ones we discussed in the "Armory" chapter. We
disengage to protect ourselves from vulnerability, shame, and feeling
lost and without purpose. We also disengage when we feel like the people
who are leading us—our boss, our teachers, our principal, our clergy,
our parents, our politicians—aren't living up to their end of the social
contract.”
“The gap starts here: We can't give people what we don't have. Who we
are matters immeasurably more than what we know or who we want to be.
The space between our practiced values (what we're actually doing,
thinking, and feeling) and our aspirational values (what we want to do,
think, and feel) is the value gap, or what I call "the disengagement
divide." It's where we lose our employees, our clients, our students,
our teachers, our congregations, and even our own children.”
What she says about vulnerability in sales:
“My answer was no. And yes. In that scenario vulnerability is
recognizing and owning that you don't know something; it's looking the
customer in the eye and saying, "I don't know the answer to that, but
I'll find out. I want to make sure you have the correct information." I
explained that the unwillingness to engage with the vulnerability of not
knowing often leads to making excuses, dodging the question,
or—worst-case scenario—bullshitting. That's the deathblow in any
relationship, and the one thing I've learned from talking to people who
sell for a living is that sales is all about relationships.”
And her Daring Greatly Leadership Manifesto:
“To the CEOs and teachers. To the principals and the managers. To the
politicians, community leaders, and decision makers:
We want to show up, we want to learn, and we want to inspire.
We are hardwired for connection, curiosity, and engagement.
We crave purpose, and we have a deep desire to create and
contribute.
We want to take risks, embrace our vulnerabilities, and be
courageous.
When learning and working are dehumanized, when you no longer see us
and no longer encourage our daring, or when you only see what we produce
or how we perform, we disengage and turn away from the very things that
the world needs from us: our talent, our ideas, and our passion.
What we ask is that you engage with us, show up beside us, and learn
from us.
Feedback is a function of respect; when you don't have honest
conversations with us about our strengths and our opportunities for
growth, we question our contributions and your commitment.
Above all else, we ask that you show up, let yourself be seen, and be
courageous. Dare Greatly with us.”
The key aspect of managing up is to learn to speak the language
of your counterpart. If you can speak their language you can understand
their goals and fears, and you can communicate at the level they are.
You'll be in a better position to be an effective report. — Umberto
Nicoletti, Head of R&D at Proemion
The better we understand the goals that our managers have, the
less surprising their actions will be. […] Some of the situations where
managers act in ways that most dismay or surprise us are when they are
acting on their fears and worries. - Joe Chippindale, CTO
Coach
― Frameworks for Managing Up as a Software Engineer - High
Growth Engineer [Link]
Building Trust:
Sincerity — you are honest and transparent, even when it’s
uncomfortable. This includes admitting mistakes early, being upfront
with challenges, and sharing both good and bad news, without
sugar-coating the latter.
Reliability — this is about consistency and following through. You
do what you say you'll do, you set realistic expectations, and
communicate proactively through regular update habits. More on this
later in the updates section.
Care — you have their best interests in mind. This means
understanding their goals and challenges, being proactive in helping
them succeed, and showing empathy when things get tough.
Competence — finally, you deliver results. This goes beyond
technical skills: it's about delivering business value, learning and
growing from feedback, and understanding the big picture.
Speaking their language:
Map their context
What makes you successful? — What are your goals and
concerns?
What makes me successful? — How can I help you reach your
goals?
The only way for you to be successful is to make your
manager successful. To do that, you need to be able to map your
goals and concerns into their own.
Translate impact across altitudes
For any item you report to your manager, the question you should ask
yourself is: why should my manager care about this? And, more
subtly: what about this does my manager care about?
Create explicit agreements
Scope of ownership — do you know what decisions you can make
autonomously vs when you need to involve your manager?
Success criteria — how do you know if what you do is successful? Do
you know how impact will be measured?
Mutual expectations — do you know what your manager needs from you?
And do they know what you need from them?
Creating effective updates
Define your update stack
Async messages (daily) — about significant progress or
blockers.
Written reports (weekly) — structured updates about key results and
next steps.
1:1s (weekly or biweekly) — deeper conversations about growth,
wellbeing, and strategy.
Make every update count
Why does this matter to my manager?
What should they do with this information?
Build a feedback loop
Use 1:1s, retrospectives, and feedback moments to inspect your update
process: what's working? What feels like noise? What critical
information is missing?
JD Vance's AI Summit Paris Speech - Artificial Intelligence
Survey [Link]
[YouTube]
Here are some of JD Vance's main points regarding AI, on behalf of
the Trump Administration:
Vance emphasizes AI's potential for revolutionary applications and
economic innovation and advocates against being too risk-averse. This is
the main stance of this optimistic speech - more AI opportunity, less
AI safety.
He states the administration aims to ensure American AI technology
remains the gold standard and the U.S. is the preferred partner for AI
expansion. The U.S. wants to partner with other countries in the AI
revolution with openness and collaboration, but this requires
international regulatory regimes that foster creation rather than
strangling it.
He expresses concern that excessive regulation could stifle the AI
industry and supports a deregulatory approach. He mentions the
development of an AI action plan that avoids overly precautionary
regulatory regimes, while ensuring that all Americans benefit from the
technology and its transformative potential. The administration is
troubled by foreign governments tightening regulations on U.S. tech
companies with international footprints. Vance states that preserving an
open regulatory environment has encouraged American innovators to
experiment.
He stresses that American AI should not be co-opted for
authoritarian censorship and should be free from ideological bias.
He notes the importance of building the most powerful AI systems in
the U.S. with American-designed and manufactured chips.
He believes AI should be a tool for job creation and making workers
more productive, prosperous, and free. The administration will always
center American workers in its AI policy and ensure that AI makes
workers more productive. For all major AI policy decisions coming from
the federal government, the Trump Administration will guarantee American
workers a seat at the table.
[Image: JD Vance AI Action Summit tweets]
Elon Musk Blocked a Bill to Stop Amazon from Helping Kids
Kill Themselves - BIG by Matt Stoller [Link]
In December, Elon Musk pushed to cut down the government funding
legislation, which led to the removal of several provisions. One
provision removed due to Musk's intervention was the Youth
Poisoning Prevention Act, which would have prevented consumers
from buying concentrated sodium nitrite, a chemical often used in
teenage suicides. This chemical, while used in low concentrations as a
food preservative, is lethal in high concentrations and has no household
uses.
The article highlights that Musk, who has significant political
power, can make harmful mistakes, sometimes unknowingly. The author
notes that the removal of the provision was considered a mistake that
could be fixed. Despite bipartisan support for the priorities, there has
been no action taken to reinstate them. He questions whether anyone will
address and rectify the issues that arise from actions taken by figures
like Musk and Trump.
Deep Research, information vs. insight, and the nature of
science - Interconnects [Link]
This is a very interesting point: the article considers how AI might
challenge Thomas Kuhn's theories of scientific revolutions. Kuhn's
The Structure of Scientific Revolutions describes how science
evolves, with scientists forming paradigms around
theories and using them to gain knowledge until limitations necessitate
a new paradigm. Here's how AI might challenge Kuhn's theories:
AI is accelerating scientific progress, potentially faster than
paradigms can be established. The fundamental unit of scientific
progress is shrinking so quickly that it redefines experimental
methods.
Kuhn emphasizes that scientific knowledge is a process, not a set of
fixed ideas. AI's emergence challenges this.
Kuhn suggests science is done by a community that slowly builds out
the frontier of knowledge, rather than filling in a known space. The
article questions how the dynamics of science will change with AI
systems.
Kuhn states that to reject one paradigm requires the simultaneous
substitution of another. The article implies that AI's rapid
advancements may disrupt this process.
Check out this
impressive list of stories they’ve broken since Trump took
office:
2025: the Year of Datacenter Mania - AI Supremacy
[Link]
This is an overview of what's happening and going to happen around
Data Center Construction, covering a wide range of areas.
AI Expansion and Energy Demand:
The AI race is intensifying, leading to significant capital
expenditure by Big Tech and raising concerns about potential harmful
consequences. AI data centers' power demands are rapidly increasing,
with estimates of needing 10 gigawatts of additional capacity in 2025
alone.
Goldman Sachs Research projects a 165% increase in data center power
demand by 2030. Global AI data center power demand could reach 68 GW by
2027 and 327 GW by 2030, compared to a total global data center
capacity of 88 GW in 2022. Training AI models could require up to 1 GW
in a single location by 2028 and 8 GW by 2030.
Infrastructure and Logistical Challenges:
Power infrastructure delays are increasing wait times for grid
connections, which can take four to seven years in key regions. Data
centers face struggles with local and state permits, especially for
backup generators and environmental impact assessments.
A lack of data center infrastructure in the U.S. could cause a shift
of construction to other countries. Countries with greater compute
access may gain economic and military advantages.
Environmental and Health Concerns:
There are growing concerns that the impact of data centers on human
health is being overlooked, and one of President Biden's executive
orders acknowledges that data centers are harmful to health.
The environmental cost of AI includes concerns about water
consumption, air pollution, electronic waste, and critical materials, in
addition to public health concerns around pollution.
Energy Solutions and the Nuclear Option:
To meet AI’s growing power needs, some experts advocate for nuclear
energy as the most viable long-term solution. Nuclear energy produces no
carbon emissions during operation and offers a reliable, constant energy
supply. Tech giants like Microsoft and Google are recognizing nuclear
energy’s potential, with Microsoft exploring small modular reactors
(SMRs). The adoption of nuclear energy faces obstacles such as high
upfront costs, regulatory hurdles, and public skepticism.
Global AI Race and Investments:
The EU is mobilizing $200 billion in AI investments, signifying a
global race for AI leadership. The UAE is investing billions in AI data
centers in France and is involved, along with SoftBank and Oracle, in
OpenAI's data center project in Abilene, Texas.
The Question of Sustainability:
AI's rapid expansion is testing the limits of power infrastructure,
natural resources, and sustainability efforts. If AI continues to expand
at its current rate, there is a risk of a gridlocked future limited by
energy availability. The future of AI depends on sustainability and the
willingness to sacrifice energy for intelligence.
Amazon is planning to invest over $100 billion in 2025, primarily in
AI-related infrastructure. This is more than any other company, and a
20% increase from 2024.
AWS revenue grew 19% Y/Y, and roughly half of the growth is
attributed to AI. AWS has a 30% market share in cloud infrastructure.
Amazon is focused on custom silicon (Trainium and Inferentia) to improve
AI efficiency.
The de minimis exemption, which allows imports
under $800 to avoid US tariffs, gives companies like Shein and Temu a
competitive edge. Amazon Haul was launched last year to
compete directly with these companies. Should the de minimis loophole be
eliminated, Amazon's superior logistics network could give it an
advantage in fulfillment and reliability.
Amazon Prime's multi-faceted membership is highly
effective at reducing churn. A 2022 study
by the National Research Group found that Prime has one of the
lowest churn rates, second only to cloud storage and music
streaming services. Amazon's detailed purchase data provides advertisers
with a valuable advantage, enabling highly targeted CTV ads with
industry-leading returns on ad spend (ROAS).
Uber’s Three-Pronged AV Strategy:
Fleet partnerships: Uber isn’t building its own
AVs. Instead, it partners with companies like Waymo, Motional, and
Aurora, integrating their fleets into Uber’s network.
Hybrid model: AVs can’t handle all trips—human
drivers will fill gaps, handling extreme weather, complex routes, and
peak hours for decades.
Fleet infrastructure: Uber is investing in
charging depots and fleet management to maximize AV asset
utilization.
While Tesla is vertically integrated, its rideshare strategy may
take a different path. If Tesla adopts an asset-light model, Tesla
owners—not Tesla itself—would decide whether to list their AVs on Uber.
If maximum utilization is the goal, Uber could be the logical
choice.
When it comes to demand aggregation, Uber remains the undisputed
leader—its network effects ensure that as long as it aggregates supply,
demand will follow, and gross profit will scale.
While the rideshare market will become more fragmented, Uber
could still be the biggest fish in a much larger pond. After all, Uber
is already the Airbnb for cars.
Tesla has a massive opportunity once the pieces fall into place.
But with auto sales under pressure and market share declining, it still
faces a long road ahead before claiming the top spot in any
market.
― Tesla vs. Uber: Collision Course? - App Economy
Insights [Link]
Uber's business model is one of my favorite business models, not only
because it's asset-light and has strong network effects, but also
because it has created millions of jobs.
Uber's AV strategy is designed to balance innovation with
practicality, ensuring that the company remains competitive while
minimizing risks and costs. By leveraging partnerships, maintaining a
hybrid model, and investing in infrastructure, Uber is well-positioned
to lead the transition to autonomous mobility.
While a partnership between Uber and Tesla is possible and could
offer significant synergies, it is not guaranteed. The decision would
depend on whether both companies can align their goals and overcome
competitive tensions. If Tesla decides to prioritize its own
ride-hailing network (Tesla Network), it may choose to compete rather
than collaborate with Uber. However, if Tesla sees more value in
leveraging Uber’s platform and customer base, a partnership could be a
strategic move for both companies.
[Image: Uber investor presentation]
Microsoft: AI Efficiency Paradox - App Economy
Insights [Link]
The End of Search, The Beginning of Research - One Useful
Thing [Link]
Huang’s take: “We've really only tapped
consumer AI and search and some amount of consumer generative AI,
advertising, recommenders, kind of the early days of software. […]
Future reasoning models can consume much more compute.”
DeepSeek-R1, he said, has “ignited global enthusiasm” and will
push reasoning AI into even more compute-intensive
applications.
Huang introduced a framework for AI’s evolving compute demands,
outlining three scaling laws:
Pre-training scaling: Traditional model growth
through data consumption, now enhanced by multimodal learning and
reasoning-based data.
Post-training scaling: The fastest-growing compute
demand, driven by reinforcement learning from human and AI feedback.
This phase now exceeds pre-training in compute usage due to the
generation of synthetic data.
Inference & reasoning scaling: The next major
shift, where AI engages in complex reasoning (e.g., chain-of-thought,
search). Inference already requires 100x more compute than early LLMs
and could scale to millions of times more.
Jensen Huang outlined a three-layer AI transformation across
industries:
Agentic AI (Enterprise AI): AI copilots and
automation tools boosting productivity in sectors like automotive,
finance, and healthcare.
Physical AI (AI for Machines): AI-driven training
systems for robotics, warehouses, and autonomous vehicles.
Robotic AI (AI in the Real World): AI enabling
real-world interaction and navigation, from self-driving cars to
industrial robots.
Grab: The Uber Slayer - App Economy Insights [Link]
DeepSeek isn’t a threat—it’s validation. If AI
inference costs are falling, Meta stands to benefit more than almost any
other company. Instead of challenging its strategy, DeepSeek reinforces
that heavy AI investments will pay off—not the other way
around.
Elon Musk and spiky intelligence - Silver Bulletin
[Link]
Interesting study on spiky intelligence, using Elon as a case study.
Concept highlights:
Spiky Intelligence: This refers to individuals who
exhibit exceptional abilities in certain areas while being deficient in
others. It contrasts with the idea of general intelligence (the "g
factor"), where most cognitive abilities are positively correlated.
Spiky intelligence is often seen in people who excel in abstract,
analytical reasoning but may lack emotional intelligence, empathy, or
practical judgment.
Berkson’s Paradox: This statistical phenomenon
explains why successful individuals often appear to have significant
weaknesses. In highly competitive fields, it’s rare to find people who
excel in all dimensions, so success often goes to those with a few
standout traits.
[Image: Berkson's paradox with selection effects]
YouTube and Podcasts
DOGE vs USAID, Crypto Framework, Google's $75B AI Spend, US
Sovereign Wealth Fund, GLP-1s - All-In Podcast [Link]
DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI
Megaclusters - Lex Fridman Podcast #459 [Link] [Transcript]
This is a very good one: a five-hour introduction to and overview of
the current AI landscape.
Those driver jobs weren't even there 10 years ago. Uber came
along and created all these driver jobs. DoorDash created all these
driver jobs. So what technology does—yes, technology destroys jobs—but
it replaces them with opportunities that are even better. And then,
either you can go capture that opportunity yourself, or an entrepreneur
will come along and create something that allows you to capture those
opportunities. AI is a productivity tool. It increases the productivity
of a worker; it allows them to do more creative work and less repetitive
work. As such, it makes them more valuable. Yes, there is some
retraining involved, but not a lot. These are natural language
computers—you can talk to them in plain English, and they talk back to
you in plain English. But I think David is absolutely right. I think we
will see job creation by AI that will be as fast or faster than job
destruction. You saw this even with the internet. Like, YouTube came
along—look at all these YouTube streamers and influencers. That didn’t
used to be a job. New jobs—really, opportunities—because 'job' is the
wrong word. 'Job' implies someone else has to give it to me, like
they're handed out, as if it's a zero-sum game. Forget all that—it's
opportunities. After COVID, look at how many people are making money by
working from home in mysterious little ways on the internet that you
can't even quite grasp. - Naval Ravikant
You know, as long as you remain adaptive and you keep learning and
you learn how to take advantage of these tools, you should do better.
And if you wall yourself off from the technology and don't take
advantage of it, that's when you put yourself at risk. - David Sacks
If you trained on the open web, your model should be open
source. – Naval Ravikant.
To keep the conversation moving, let me segue a point that came
up that was really important into tariffs. And the point is, even though
the internet was open, the U.S. won a lot of the internet—a lot of U.S.
companies won the internet. And they won that because we got there "the
firstest with the mostest," as they say in the military. And that
matters because a lot of technology businesses have scale economies and
network effects underneath, even basic brand-based network effects. If
you go back to the late '90s and early 2000s, very few people would have
predicted that we would have ended up with Amazon basically owning all
of e-commerce. You would have thought it would have been perfect
competition and very spread out. And that applies to how we end up with
Uber as basically one taxi service or how we end up with Airbnb.
Meta—Airbnb—it's just network effects, network effects, network effects
rule the world around me. But when it comes to tariffs and when it comes
to trade, we act like network effects don't exist. The classic Ricardian
comparative advantage dogma says that you should produce what you're
best at, I produce what I'm best at, and we trade. And then, even if you
want to charge me more for it—if you want to impose tariffs for me to
ship to you—I should still keep tariffs down because I'm better off.
You're just selling me stuff cheaply—great. Or if you want to subsidize
your guys—great, you're selling me stuff cheaply. The problem is, that
is not how most modern businesses work. Most modern businesses have
network effects. As a simple thought experiment, suppose that we have
two countries, right? I'm China, you're the U.S. I start out by
subsidizing all of my companies and industries that have network
effects. So I'll subsidize TikTok, I'll ban your social media but push
mine. I will subsidize my semiconductors, which tend to have
winner-take-all dynamics in certain categories. Or I'll subsidize my
drones and then, exactly—BYD, self-driving, whatever. And then, when I
win, I own the whole market and I can raise prices. And if you try to
start up a competitor, it's too late—I've got network effects. Or if
I've got scale economies, I can lower my price to zero, crash you out of
business, no one in their right mind will invest, and then I'll raise
prices right back up. So you have to understand that certain industries
have hysteresis, or they have network effects, or they have economies of
scale—and these are all the interesting ones. These are all the
high-margin businesses. So in those, if somebody is subsidizing or
they're raising tariffs against you to protect their industries and let
them develop, you do have to do something. You can't just completely
back down. - Naval Ravikant
I think Sam and his team would do better to leave the nonprofit
part alone, leave an actual independent nonprofit board in charge, and
then have a strong incentive plan and a strong fundraising plan for the
investors and the employees. So I think this is workable. It's just that
trying to grab it all seems way off, especially when it was built on
open algorithms from Google, open data from the web, and nonprofit
funding from Elon and others. - Naval Ravikant
― JD Vance's AI Speech, Techno-Optimists vs Doomers, Tariffs,
AI Court Cases with Naval Ravikant - All-In Podcast [Link]
"AI won't take your job; it's someone using AI that will take your
job." – Richard Baldwin. The discussion around AI's impact on jobs is
often framed as a zero-sum game, but the reality is more nuanced. While
AI will displace certain jobs (e.g., self-driving cars replacing
drivers), it will also create new opportunities and industries that we
can't yet fully envision. The key is adaptability—those who learn to use
AI tools will thrive, while those who resist will fall behind.
The Stablecoin Future, Milei's Memecoin, DOGE for the DoD,
Grok 3, Why Stripe Stays Private - All-In Podcast [Link]
How to build full-stack apps with OpenAI o1 pro - Part 1 -
Mckay Wrigley [Link]
Learn app development using OpenAI o1-Pro with a structured
six-prompt workflow.
Build and run a deep research agent with LangGraph Studio, customize
configurations, compare architectures, and analyze costs.
Papers and Reports
Probabilistic weather forecasting with machine
learning [Link]
GenCast's success stems from its ability to generate ensembles of
sharp, realistic weather trajectories and well-calibrated probability
distributions.
The methodology of GenCast involves several key components:
GenCast employs a second-order Markov assumption,
meaning it conditions its predictions on the two previous weather
states rather than just one, which was found to work better in
practice.
GenCast is implemented as a conditional diffusion
model. Diffusion models are generative machine learning methods
that can model the probability distribution of complex data and generate
new samples by iteratively refining a noisy initial state. The model
predicts a residual with respect to the most recent weather state. The
sampling process begins with random noise, which is then refined over a
series of steps.
At each step of the iterative refinement process, GenCast uses a
denoiser neural network. This network is trained to
remove noise that has been artificially added to atmospheric states. The
architecture of the denoiser includes an encoder, a processor, and a
decoder. The encoder maps the noisy target state to an internal
representation on a refined icosahedral mesh, the processor is a
graph transformer, and the decoder maps the internal
mesh representation back to a denoised target state.
GenCast uses a noise distribution that respects the spherical
geometry of global weather variables. Rather than using independent and
identically distributed (i.i.d.) Gaussian noise on the
latitude-longitude grid, it samples isotropic Gaussian white
noise on the sphere and projects it onto the grid.
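Putting those pieces together, sampling reads roughly like this sketch (schematic, not DeepMind's code):

```python
# Schematic GenCast-style sampling: condition on the two most recent weather
# states (second-order Markov) and iteratively refine noise into a residual.
import torch

def sample_next_state(denoiser, x_prev2, x_prev1, n_steps=20):
    # The paper uses isotropic Gaussian white noise on the sphere; plain randn
    # on the lat-lon grid is used here only to keep the sketch self-contained.
    z = torch.randn_like(x_prev1)
    for t in reversed(range(n_steps)):
        z = denoiser(z, x_prev2, x_prev1, t)  # encoder -> graph transformer -> decoder
    return x_prev1 + z  # the model predicts a residual w.r.t. the latest state
```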
GenCast's performance is evaluated using various metrics, including:
CRPS (Continuous Ranked Probability Score):
Measures the skill of a probabilistic forecast.
RMSE (Root Mean Squared Error): Measures how
closely the mean of an ensemble of forecasts matches the ground
truth.
Spread/Skill Ratios and Rank Histograms: Used to
evaluate the calibration of the forecast distributions.
Brier Skill Score: Evaluates probabilistic
forecasts of binary events, specifically the prediction of extreme
weather events.
Relative Economic Value (REV): Characterizes the
potential value of a forecast over a range of probability decision
thresholds.
Spatially Pooled CRPS: Evaluates forecasts
aggregated over circular spatial regions of varying sizes to assess the
model's ability to capture spatial dependencies.
Regional Wind Power Forecasting: Evaluates the
model's ability to predict wind power generation at wind farm locations
using a standard idealized power curve.
Tropical Cyclone Track Prediction: Uses the
TempestExtremes tropical cyclone tracker to extract cyclone trajectories
from the forecast and analysis data. The model's ability to forecast
cyclone tracks is evaluated using position error and track
probability.
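Of these, CRPS is the headline metric; the standard ensemble estimator is easy to state (a generic sketch, not DeepMind's evaluation code):

```python
# Ensemble CRPS estimator: mean absolute error of members against the truth,
# minus half the mean absolute difference between member pairs (spread).
import numpy as np

def crps_ensemble(members, truth):
    members = np.asarray(members, dtype=float)  # shape [n_members]
    skill = np.abs(members - truth).mean()
    spread = np.abs(members[:, None] - members[None, :]).mean()
    return skill - 0.5 * spread  # lower is better
```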
The United States currently leads the world in data centers and
AI compute, but unprecedented demand leaves the industry struggling to
find the power capacity needed for rapidly building new data centers.
Failure to address current bottlenecks may compel U.S. companies to
relocate AI infrastructure abroad, potentially compromising the U.S.
competitive advantage in compute and AI and increasing the risk of
intellectual property theft.
― AI's Power Requirements Under Exponential Growth -
RAND [Link]
[pdf]
Genome modeling and design across all domains of life with
Evo 2 - Arc Institute [Link]
Evo 2 is a powerful genome modeling and design tool that operates
across all domains of life. It can analyze and generate genetic
sequences from molecular to genome scale. It accurately assigns
likelihood scores to human disease variants, distinguishing between
pathogenic and benign mutations in both coding and noncoding regions. It
can predict whether genes are essential or nonessential using mutational
likelihoods, helping in bacterial and phage gene essentiality studies.
It can generate large-scale DNA sequences with structured features like
tRNAs, promoters, and genes with intronic structures. It provides
zero-shot fitness predictions for protein and non-coding RNA sequences,
correlating well with experimental fitness measurements. It robustly
predicts the pathogenicity of various mutation types, achieving
state-of-the-art performance for noncoding and splice variants.
Large Action Models: From Inception to
Implementation [Link]
Microsoft Research published one of the most complete papers in this
area, outlining an end-to-end framework for large action models (LAMs).
The core idea is to bridge the gap between the language
understanding capability of LLMs and the need for real-world action
execution.
Uncovering the Impact of Chain-of-Thought Reasoning for
Direct Preference Optimization: Lessons from Text-to-SQL [Link]
Direct Preference Optimization (DPO) does not consistently improve
performance in the Text-to-SQL task and sometimes even degrades it.
Existing supervised fine-tuning (SFT) methods are limited by the lack of
high-quality training data, and prompting-based methods are expensive,
slow, and raise data privacy concerns.
To solve the problems, they generate synthetic CoT
solutions to improve training datasets, leading to more
stable and significant performance improvements in DPO. They
integrate execution-based feedback to refine the
model’s SQL generation process, making the optimization process more
reliable. And they create a quadruple-based preference
dataset to help the model learn to distinguish between correct
and incorrect SQL responses more effectively.
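For orientation, the standard DPO objective such work builds on, as a generic sketch (not the paper's code); here the chosen/rejected pair would be a correct vs. incorrect SQL response with its CoT solution.

```python
# Generic DPO loss: push the policy to prefer the chosen response over the
# rejected one, relative to a frozen reference model.
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Inputs are per-example log-probabilities of each full response, summed
    # over its tokens, under the policy and the reference model respectively.
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()
```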
MONA: Myopic Optimization with Non-myopic Approval Can
Mitigate Multi-step Reward Hacking [Link]
Google DeepMind developed an innovative approach, Myopic
Optimization with Non-myopic Approval (MONA), to mitigate multi-step
reward hacking. The MONA methodology is built on two key principles.
The first is myopic optimization, where agents focus on
maximizing rewards for immediate actions rather than planning multi-step
strategies. This ensures that agents do not develop complex,
unintelligible tactics. The second principle is non-myopic
approval, where human overseers assess the agent's actions
based on their expected long-term utility. These evaluations serve as
the primary mechanism for guiding agents toward behavior aligned with
human-defined objectives, without relying on direct feedback from
outcomes.
[Image: Google DeepMind MONA]
The Ultra-Scale Playbook: Training LLMs on GPU Clusters -
Hugging Face [Link]
This book from Hugging Face explains 5D parallelism, ZeRO, CUDA
kernel optimizations, and compute-communication overlap in large-scale
AI training. It breaks down scaling bottlenecks, PyTorch internals, and
parallelism techniques like ZeRO-3, pipeline, sequence, and context
parallelism.
Articles and Blogs
The research found six distinct leadership styles, each springing
from different components of emotional intelligence. The styles, taken
individually, appear to have a direct and unique impact on the working
atmosphere of a company, division, or team, and in turn, on its
financial performance. And perhaps most important, the research
indicates that leaders with the best results do not rely on only one
leadership style; they use most of them in a given week—seamlessly and
in different measure—depending on the business situation. Imagine the
styles, then, as the array of clubs in a golf pro’s bag. Over the course
of a game, the pro picks and chooses clubs based on the demands of the
shot. Sometimes he has to ponder his selection, but usually it is
automatic. The pro senses the challenge ahead, swiftly pulls out the
right tool, and elegantly puts it to work. That’s how high-impact
leaders operate, too.
Leaders who have mastered four or more—especially the
authoritative, democratic, affiliative, and coaching styles—have the
very best climate and business performance.
The leader can build a team with members who employ styles she
lacks.
― Leadership That Gets Results - Harvard Business
Review [Link]
GenCast predicts weather and the risks of extreme conditions
with state-of-the-art accuracy - Google DeepMind [Link]
Morgan Stanley stated that ASICs perform exceptionally well in
certain specific application scenarios but are highly dependent on the
custom needs of particular clients; the development cost of ASICs is
usually lower, but their system costs and software deployment costs may
be much higher than those of commercially scalable GPUs, leading to a
higher total cost of ownership. In addition, NVIDIA's CUDA ecosystem is
very mature and widely used across global cloud computing services,
with a market position that remains as solid as ever.
― Morgan Stanley: ASICs are overheated, and NVIDIA's position
is difficult to shake. - moomoo [Link]
NVIDIA possesses a robust competitive advantage in the AI chip market
due to its mature ecosystem, continuous R&D investments, and strong
technical capabilities.
NVIDIA's CUDA ecosystem is well-established,
enabling clients to easily deploy and run various workloads. The
maturity of this ecosystem means that customers may find it easier to
use NVIDIA products compared to adapting software for ASICs or other
alternatives.
NVIDIA has a leading position in the AI chip
market, which is reinforced by its presence on every
cloud platform across the globe. Investments within NVIDIA's
ecosystem benefit from global dissemination, further solidifying its
market dominance.
NVIDIA invests significantly in R&D. The
company is expected to invest approximately $16 billion in R&D this
year. This level of investment allows NVIDIA to maintain a 4-5 year
development cycle and continuously introduce leading
high-performance chips. Custom ASIC development budgets are typically
smaller (less than $1 billion), giving NVIDIA an edge in innovation.
NVIDIA is difficult to surpass in providing high-end
training capabilities. The company focuses on training
multi-modal AGI models.
This means DeepResearch can identify cross-domain links or
examples that might otherwise be overlooked, offering fresh
perspectives. In professional settings, this can support more
well-rounded decision-making – for example, a product manager can
quickly gather insights from scientific research, market data, and
consumer opinions in one place, rather than relying on multiple teams or
lengthy research processes. It makes you multifaceted!
― #87: Why DeepResearch Should Be Your New Hire - Turing
Post [Link]
Deep Research and Knowledge Value - Stratechery [Link]
OpenAI launched Deep Research in ChatGPT, which is an agentic
capability that conducts multi-step research on the internet for complex
tasks. It synthesizes knowledge in an economically valuable way but does
not create new knowledge.
As demonstrated in the article, it can be useful for researching
people and companies before conducting interviews. However, it can also
produce reports that are completely wrong by missing major entities in
an industry.
This is a good point - The Internet revealed that news was
worthless in terms of economic value because the societal value does not
translate to economic value. Deep Research reveals how much more could
be known, but the increasing amount of "slop" makes it more difficult to
find the right information. Information that matters and is not on the
Internet has future economic value wrapped up in it.
Proprietary data is valuable, and AI tools like Deep Research make it
more difficult to harvest alpha from reading financial filings.
Prediction markets may become more important as AI increases the
incentive to keep things secret.
As a summary of the impact - Deep Research is a good value, but
it is limited by the quality of information on the Internet and
the quality of the prompt. There is value in the search for and
sifting of information, and this may be lost with on-demand reports. AI
will replace knowledge work. Secrecy is a form of friction that imposes
scarcity on valuable knowledge. Deep Research is not yet good at
understanding some things.
Massive Foundation Model for Biomolecular Sciences Now
Available via NVIDIA BioNeMo - NVIDIA Blog [Link]
Grok-3 (codename "chocolate") is the first model ever to break a 1400
score and is now #1 in Arena.
Grok 3 Beta — The Age of Reasoning Agents - Grok
Blog [Link]
Motivated by unmet needs in the modern scientific discovery
process and building on recent AI advances,
including the ability to synthesize across complex subjects and to
perform long-term
planning and reasoning, we developed an AI
co-scientist system. The AI co-scientist is a multi-agent AI system
that is intended to function as a collaborative tool for scientists.
Built on Gemini
2.0, AI co-scientist is designed to mirror the reasoning process
underpinning the scientific method. Beyond standard literature review,
summarization and “deep research” tools, the AI co-scientist system is
intended to uncover new, original knowledge and to formulate
demonstrably novel research hypotheses and proposals, building upon
prior evidence and tailored to specific research objectives.
― Accelerating scientific breakthroughs with an AI
co-scientist - Google Blog [Link]
AICoScientist-1-Components
An Interview with Uber CEO Dara Khosrowshahi About
Aggregation and Autonomy - Stratechery [Link]
Studies on the brain affirm the benefits of Tom’s visualization
technique: Imagining something in vivid detail can fire the same brain
cells actually involved in doing that activity. The new brain circuitry
appears to go through its paces, strengthening connections, even when we
merely repeat the sequence in our minds. So to alleviate the fears
associated with trying out riskier ways of leading, we should first
visualize some likely scenarios. Doing so will make us feel less awkward
when we actually put the new skills into practice.
― Primal Leadership: The Hidden Driver of Great Performance -
Harvard Business Review [Link]
Imagine it, fake it, and make it.
Our research tells us that three conditions are essential to a
group’s effectiveness: trust among members, a sense of group identity,
and a sense of group efficacy.
― Building the Emotional Intelligence of Groups - Harvard
Business Review [Link]
Teams are so important to leaders.
Interrupt the ascent.
When people are continually promoted within their areas of
expertise, they don’t have to stray far from their comfort zones, so
they seldom need to ask for help, especially if they’re good problem
solvers. Accordingly, they may become overly independent and fail to
cultivate relationships with people who could be useful to them in the
future. What’s more, they may rely on the authority that comes with rank
rather than learning how to influence people. A command-and-control
mentality may work in certain situations, particularly in lower to
middle management, but it’s usually insufficient in more senior
positions, when peer relationships are critical and success depends more
on the ability to move hearts and minds than on the ability to develop
business solutions.
― The Young and the Clueless - Harvard Business
Review [Link]
Don't fall into the independence trap.
Introducing Perplexity Deep Research - Perplexity
[Link]
The trend is toward deep research.
Shopify Tells Employees to Just Say No to Meetings -
Bloomberg [Link]
Who will control the future of AI? - The Washington
Post [Link]
Sam Altman promotes a U.S.-led strategy to ensure AI development aligns
with democratic values and remains under the leadership of the U.S. and
its allies.
This new architecture used to develop the Majorana 1 processor
offers a clear path to fit a million qubits on a single chip that can
fit in the palm of one’s hand.
Microsoft is now one of two companies to be invited
to move to the final phase of DARPA’s Underexplored Systems for
Utility-Scale Quantum Computing (US2QC) program – one of the programs
that makes up DARPA’s larger Quantum
Benchmarking Initiative – which aims to deliver the industry’s first
utility-scale fault-tolerant quantum computer, or one whose
computational value exceeds its costs.
― Microsoft’s Majorana 1 chip carves new path for quantum
computing - Microsoft [Link]
In the near term, Google’s approach with superconducting qubits (like
Willow) is more mature. This technology has already demonstrated
impressive benchmarks and is backed by years of incremental
improvements. Its error correction techniques, while still challenging,
are well‑studied, and scaling up using transmon qubits is an area where
significant progress has been made.
On the other hand, Microsoft’s topological approach with Majorana 1
aims to use a completely new type of qubit—one that is “protected by
design” thanks to its topological nature. In theory, this means lower
error rates and potentially a much more scalable architecture with fewer
physical qubits needed per logical qubit. However, this method is still
very experimental, and questions remain over whether true Majorana zero
modes have been reliably created and controlled.
In summary, for near‑term practical applications, Google’s path
appears to be the safer bet. But if Microsoft’s topological qubit
platform can overcome its technical hurdles, it may ultimately provide a
more efficient and scalable route to fault‑tolerant quantum
computing.
OpenAI tries to ‘uncensor’ ChatGPT - Techcrunch [Link]
Elon Musk Ally Tells Staff ‘AI-First’ Is the Future of Key
Government Agency - WIRED [Link]
Thomas Shedd, a former Tesla engineer and ally of Elon Musk, is
implementing an "AI-first strategy" at the General Services
Administration's Technology Transformation Services (TTS). Shedd
envisions the agency operating like a software startup, automating tasks
and centralizing federal data. This shift is causing concern among GSA
staff, who report being thrown into unexpected meetings and facing
potential workforce cuts. Shedd is promoting collaboration between TTS
and the United States DOGE Service (formerly the United States Digital
Service), though specifics about
the new AI-driven projects and data repository remain unclear. A
cybersecurity expert expressed concern that automating government tasks
is difficult and the attempt is raising red flags. Employees also voiced
concerns regarding working hours and potential job losses.
My thoughts regarding the AI landscape at the current stage:
As open-source AI becomes more affordable, it is poised to become as
ubiquitous and accessible as electricity—financially viable for
everyone. The AI and AGI arms race, whether between nations, open- and
closed-source models, or competing companies, is effectively over or
should be over, and the outcome is clear. Compute power still remains
essential, but semiconductor giants like NVIDIA should look beyond
language model training and inference, shifting their focus to the next
frontiers, such as robotics and world models. Now is the time for
developers and startups to concentrate on the vertical integration of
AI, where real economic value can be realized.
DeepSeek - Background
DeepSeek began as a research offshoot of
High-Flyer—a hedge fund that had already amassed a
large GPU inventory (reportedly 10,000 Nvidia A100s in 2021). Over time,
this resource base appears to have grown, with estimates suggesting
that—when you account for research, ablation experiments, and shared
infrastructure with trading—the effective pool might be closer to 50,000
GPUs. This expansive compute power enables DeepSeek to run many
experiments simultaneously and quickly iterate on new architectures.
By leveraging a shared infrastructure with its hedge fund operations,
DeepSeek can reinvest profits from quant trading into AI research. This
model of “doing more with less” not only challenges the notion that
massive, multibillion-dollar compute expenditures are necessary to build
world-class AI models but also has broader implications for the
industry. It raises questions about the future economics of AI
development and the potential for more cost-efficient, research-driven
models to shift market dynamics, as seen by the notable impact on
Nvidia’s stock and market sentiment.
Export Controls on GPUs to
China
In essence, the U.S. government originally imposed limits on chips
that exceed certain thresholds in both interconnect
bandwidth and compute (FLOPs) to restrict
China’s ability to train massive AI models. Early on, chips that
combined high interconnect speeds with high FLOPs were off‐limits.
For example, the H100—one of Nvidia’s top GPUs—was deemed too
powerful. In response, Nvidia developed the H800, which maintained the
same floating point performance (FLOPs) as the H100 but had its
interconnect bandwidth intentionally reduced to meet U.S. export
criteria. However, when the government later decided to tighten controls
further (targeting chips solely on FLOPs), even the H800 was banned.
This led Nvidia to innovate once again with the H20, a chip that now
offers full interconnect bandwidth (and even improved memory
characteristics over the H100) but with a deliberate cut in overall
FLOPs to satisfy export rules.
The strategic rationale behind these controls is to “decap” China’s
compute—especially for large-scale AI training—by limiting how many of
the most advanced GPUs (and thus the overall density of compute) can be
legally acquired. While Chinese companies can still purchase GPUs to
train models, the overall capacity available for training (which is
critical for developing super-powerful AI) is being capped. This is seen
as a way to maintain U.S. and allied leadership in AI, particularly in a
world where super-powerful AI may soon offer decisive military and
economic advantages.
Sidenote - GPUs for AI
Key GPU Specifications:
FLOPS (Compute Power): Critical for training large
models (e.g., GPT-4) but less critical for inference tasks like
reasoning.
Memory Bandwidth/Capacity: Determines how much data
(e.g., KV cache in transformers) can be stored and accessed quickly,
crucial for long-sequence tasks.
Interconnect Speed: Affects communication between
GPUs in clusters, important for distributed training but less regulated
now.
H20 vs. H100: Tradeoffs for AI Workloads:
H20 (China-specific): Its strength is higher
memory bandwidth and capacity than the H100, making it better suited for
reasoning tasks (e.g., long-context inference,
chain-of-thought). However, its FLOPS (≈1/3 of H100 on paper, ≈50-60% in
practice) are reduced, limiting its utility for training.
Regulatory Context: Designed to comply with U.S.
export controls that focus on FLOPS, allowing Nvidia to ship roughly 1M
units to China in 2024 (20-25% of its total GPUs).
H100: Optimized for FLOPS-heavy training but
less efficient for memory-bound inference tasks.
Why Memory Matters for Reasoning:
KV Cache in Transformers stores the keys/values of
all tokens in a sequence for the attention mechanism. The cache grows
linearly with sequence length, while attention computation grows
quadratically, so long sequences (e.g., 10K+ tokens in reasoning tasks)
put heavy demands on memory.
Autoregressive Generation: Output tokens require
sequential processing, forcing repeated KV cache access. This limits
parallelism and increases memory pressure. Tasks like agentic AI or
chain-of-thought involve generating long outputs (10K+ tokens),
stressing memory bandwidth/capacity.
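As a rough illustration of why long reasoning outputs are memory-bound
rather than FLOPS-bound, here is a back-of-the-envelope KV-cache
calculator; the 70B-class dimensions below are illustrative assumptions,
not the specs of any particular model:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    # Keys + values: 2 tensors per layer, each [batch, n_kv_heads, seq_len, head_dim]
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes

# Illustrative 70B-class dense model in fp16: 80 layers, 64 KV heads of dim 128.
gib = kv_cache_bytes(80, 64, 128, seq_len=10_000, batch=1) / 2**30
print(f"KV cache for one 10K-token sequence: {gib:.1f} GiB")  # ~24.4 GiB
```

Doubling the sequence length doubles the cache, which is why long
chain-of-thought generations reward the H20's larger, faster memory more
than raw FLOPS.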
DeepSeek trains on Nvidia GPUs. These are equipped
with many cores (organized into streaming multiprocessors, or SMs) that
perform the heavy lifting during both training and inference.
The GPUs they used were those legally available in
China, which imposed certain limitations—especially on
interconnect bandwidth between units. This meant that DeepSeek needed to
overcome hardware constraints that might not be present with the very
latest high-end GPUs elsewhere.
Custom Low-Level Optimization
Instead of relying solely on Nvidia’s standard NCCL (NVIDIA
Collective Communications Library) for handling inter-GPU
communications, DeepSeek’s engineers developed custom scheduling
techniques. They even scheduled communications at the SM
level, which is more granular than the typical approach.
Their implementation involved programming approaches that went deep
into the hardware—down to using PTX (an intermediate
assembly-like language for CUDA). This allowed them to squeeze extra
efficiency from each GPU by reducing the overhead in communication
between layers of the model.
Efficiency via Architectural Choices
One of the key innovations was using a sparse Mixture of
Experts (MoE) architecture. With a model that can have hundreds
of billions of parameters overall but only activates a fraction (e.g.,
around 37 billion at a time), the compute and memory demands are
dramatically reduced. This architectural choice means that even if the
hardware isn’t the absolute latest, it can still be very cost-effective
by not needing to run every parameter for every token.
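To make "activating only a fraction of parameters" concrete, here is a
minimal top-k expert-routing sketch in PyTorch; the dimensions, the
8-expert pool, and the k=2 choice are illustrative assumptions, not
DeepSeek's actual configuration:

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Sketch of sparse Mixture-of-Experts: only k of n_experts run per token."""
    def __init__(self, d_model=512, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # learned gating scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))
        self.k = k

    def forward(self, x):                        # x: [tokens, d_model]
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)        # mix the k selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):               # run only the chosen experts
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

y = TopKMoE()(torch.randn(16, 512))              # 16 tokens, 2-of-8 experts each
```

Compute per token scales with k rather than with the total expert count,
which is how a model with hundreds of billions of parameters can
activate only around 37 billion at a time.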
DeepSeek's novel attention mechanism, MLA (Multi-Head Latent
Attention), compresses keys and values into a shared low-rank latent
representation, reducing memory usage by 80–90% compared to
traditional transformer attention. This optimization lowers
computational costs, especially for long-context processing, without
sacrificing performance.
By optimizing both the hardware usage (through custom scheduling and
low-level programming) and the model architecture (via MoE and MLA),
DeepSeek manages to cut down on the cost of training. This is crucial
given the significant compute expense associated with large-scale
language models.
Pre-Training and Context Window Extension
Pre-trained on 14.8 trillion tokens drawn from a multilingual corpus
(primarily English and Chinese) with a higher proportion of math and
programming content compared to previous iterations.
Utilizes a two-phase extension (via the YaRN framework) to expand
the context length from 4K tokens to 32K and finally to 128K
tokens.
Reported training cost for V3 is approximately $5.58 million,
consuming about 2.788 million GPU-hours on Nvidia H800 GPUs, which
implies a rate of roughly $2 per GPU-hour. This figure is significantly
lower than the hundreds of millions typically reported by US rivals.
V3 is fine-tuned on a carefully curated dataset of approximately 1.5
million examples (both reasoning and non-reasoning tasks) to improve
instruction-following and output formatting.
DeepSeek employs GRPO—a group relative
policy optimization method—to reward outputs based on
correctness (accuracy rewards) and presentation (format rewards).
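The core of GRPO is that advantages come from comparing sampled outputs
within a group for the same prompt, with no learned value network. A
minimal sketch of that advantage computation (the reward numbers are
made up for illustration):

```python
import numpy as np

def grpo_advantages(rewards):
    # Standardize rewards within the group sampled for one prompt;
    # outputs above the group mean get a positive advantage.
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + 1e-8)

# Four sampled answers to the same prompt, each scored by rule-based
# accuracy + format rewards as described above (illustrative values).
print(grpo_advantages([1.2, 0.2, 1.0, 0.4]))
```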
R1 leverages RL to fine-tune the reasoning process, rewarding
chain-of-thought quality and encouraging the model to generate
self-reflective “aha moments.”
Speed-to-Market and Safety Tradeoffs
DeepSeek prioritizes rapid deployment over extensive safety
testing, avoiding delays and costs associated with ethical reviews
(common in Western firms like Anthropic). This "ship-first" approach
reduces development cycle expenses.
Releasing model weights publicly attracts third-party hosting and
innovation, indirectly expanding reach without bearing full
infrastructure costs.
The Tech and Business
Perspective
The release of DeepSeek-R1 marks a pivotal moment in the AI industry,
igniting discussions about open-source dominance, market disruption, and
geopolitical implications.
Industry Leaders Weigh In:
Yann LeCun (Meta’s Chief AI Scientist)
LeCun emphasized the growing power of open-source models over
proprietary approaches:
"To people who see the performance of DeepSeek and think China is
surpassing the US in AI. You are reading this wrong. The correct reading
is: Open source models are surpassing proprietary ones."
Andrej Karpathy (OpenAI Co-founder)
Karpathy pointed out the continued need for large-scale computing
while praising DeepSeek’s efficiency:
"Does this mean you don't need large GPU clusters for frontier
LLMs? No, but you have to ensure that you're not wasteful with what you
have, and this looks like a nice demonstration that there's still a lot
to get through with both data and algorithms."
Satya Nadella (Microsoft CEO)
Nadella underscored the significance of DeepSeek, highlighting its
role in making AI reasoning more accessible:
"We should take the developments out of China very, very
seriously."
"DeepSeek has had some real innovations. … Obviously, now all that
gets commoditized."
"When token prices fall, inference computing prices fall, that means
people can consume more, and there will be more apps written."
"DeepSeek had a few pretty novel infrastructure optimization
advances, which, fortunately, they published them, so we can not only
observe what they did, but we can read about it and implement it, so
that'll benefit us."
"Always interesting when there's someone who does something better
than you. Let's make sure we are on it."
Aravind Srinivas (Perplexity AI CEO)
Srinivas stressed the importance of foundational innovation:
"We need to build, not just wrap existing AI."
Marc Andreessen (Andreessen Horowitz Co-founder)
He likened DeepSeek-R1 to a historic milestone:
"DeepSeek R1 is AI's Sputnik moment."
Tim Cook (Apple CEO)
Cook gave a measured response during an earnings call:
"In general, I think innovation that drives efficiency is a good
thing."
Academic and
Research Perspectives
AI Researchers on DeepSeek-R1:
Timnit Gebru (AI Ethics Researcher)
Gebru reflected on past AI development priorities:
"At Google, I asked why they were fixated on building THE LARGEST
model. Why are you going for size? What function are you trying to
achieve? They responded by firing me."
Ethan Mollick (Wharton AI Professor)
Mollick focused on accessibility rather than capabilities:
"DeepSeek is a really good model, but it is not generally a
better model than o1 or Claude. But since it is both free and getting a
ton of attention, I think a lot of people who were using free 'mini'
models are being exposed to what an early 2025 reasoner AI can do and
are surprised."
Andrew Ng (AI Researcher and Entrepreneur)
Ng saw the market reaction as an opportunity for developers:
"Today's 'DeepSeek selloff' in the stock market—attributed to
DeepSeek V3/R1 disrupting the tech ecosystem—is another sign that the
application layer is a great place to be. The foundation model layer
being hyper-competitive is great for people building
applications."
Global Academic Community Response:
Huan Sun from Ohio State University noted that DeepSeek's
affordability is expanding LLM adoption in research. Cong Lu from the
University of British Columbia highlighted R1’s rapid adoption,
surpassing 3 million downloads on Hugging Face in a week. Meanwhile,
safety concerns emerged as studies revealed R1 is 11 times more likely
to generate harmful content compared to OpenAI models, prompting calls
for better safeguards.
Impact Discussion
Market and Industry Impact
The release of DeepSeek-R1 caused massive shifts in financial
markets. U.S. tech stocks collectively lost $1 trillion, with Nvidia suffering
record losses due to the rising competition from this cost-efficient
model. Investors are recalibrating AI development strategies as DeepSeek
achieved comparable performance to OpenAI’s models at just $6 million versus OpenAI’s $100 million.
Integration into Cloud Ecosystems
AWS and Microsoft Azure have incorporated DeepSeek-R1, enabling
developers to explore its capabilities securely and cost-effectively.
The emergence of cost-effective models like DeepSeek R1 is forcing a
shift in AI economics, emphasizing efficiency over massive capital
investments. As a result, competition in the AI sector is intensifying,
ushering in a “warring states era” where companies are scrambling for
innovation in cost-effective models.
Geopolitical and National Security Implications
The success of DeepSeek R1 has intensified concerns that the U.S. is
losing its technological edge to China. Policymakers are reassessing
export controls on advanced chips in light of DeepSeek's ability to
innovate using restricted hardware. Security concerns have also prompted
the U.S. Navy to ban the use of DeepSeek R1 due to potential security
and ethical risks, fueling debates over the implications of adopting
foreign-developed AI systems.
Open-Source vs Proprietary Models
DeepSeek R1 is accelerating the democratization of AI by lowering
barriers for smaller developers and researchers, fostering innovation.
However, transparency concerns remain as DeepSeek has not disclosed its
training data, raising ethical and bias-related questions.
Ethical and Technical Questions
Concerns have emerged regarding potential censorship, as some
versions of DeepSeek R1 appear to align with Chinese narratives.
Additionally, skepticism exists over whether DeepSeek’s reported costs
and capabilities are fully accurate, with some experts questioning the
factors that contributed to its success.
Public Sentiment and the Future of AI
Public reaction to DeepSeek-R1 has been mixed. Some view this as a
“Sputnik moment,” encouraging U.S. firms to accelerate AI innovation
while leveraging open-source models to stay competitive. Others see it
as a wake-up call, with President Donald Trump urging U.S.
industries to adapt quickly to maintain leadership in AI
development.
Persistence
is a cornerstone for building robust and production-grade applications.
LangGraph introduces a game-changing feature that ensures application
states are stored and retrievable at any point. This redefines
reliability and scalability in workflow management. This capability is
especially vital when executing workflows involving interruptions, user
inputs, or debugging. Whether you're building a simple app or an
enterprise-grade system, persistence ensures your application is always
ready to handle interruptions and user interactions gracefully.
The "Persisting Agent Stage" enables seamless workflows, especially
in user-facing applications. Here’s why this feature is critical:
Human-in-the-Loop Workflows: Many applications rely
on user input to make decisions or advance processes. With persistence,
LangGraph allows the graph execution to pause, checkpoint the state into
persistent storage, and resume later. This means the application can
wait for user input and continue without losing context.
Debugging and History: Persistence creates a robust
mechanism for saving the application state after every step. This makes
debugging easier and enables the creation of detailed execution
histories.
Support for Multi-Session Scenarios: Applications
often require users to switch between sessions while maintaining their
progress. Persistence ensures continuity by saving states into
persistent storage.
At the heart of this feature is the CheckPointer
object, a persistence layer implemented by LangGraph. Here’s how it
works:
Integration with Databases: The CheckPointer can
save states into various database types, including:
Document databases: Firestore, MongoDB
Relational databases: PostgreSQL, SQLite,
MySQL
Graph databases: Neo4j, AWS Neptune
For example, the following section will focus on persisting states
into an SQLite database, a popular choice for local environments. The
process can also be extended to managed cloud databases like Google
Cloud SQL or AWS RDS.
State Management: As each node in the graph
executes, the CheckPointer saves the updated state into the database.
This ensures that states are recoverable after interruptions, enabling
the graph to resume execution from exactly where it left off.
To implement persistence, follow these simple steps:
Import the CheckPointer object from LangGraph.
Create an instance of CheckPointer and configure it with a
connection string (local or cloud-based database).
Pass the CheckPointer instance to your graph during creation.
LangGraph will handle state persistence automatically after each node
execution, as shown in the sketch below.
from langgraph.checkpoint.sqlite import SqliteSaver
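Putting the steps together, a minimal sketch might look like this;
`builder` stands for an already-defined StateGraph, and the file name is
illustrative:

```python
import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# A local SQLite file as the persistence layer; check_same_thread=False
# allows the connection to be reused across stop/restart cycles.
conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
checkpointer = SqliteSaver(conn)

# Passing the checkpointer at compile time makes LangGraph persist the
# state automatically after each node execution.
graph = builder.compile(checkpointer=checkpointer)
```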
The result is that you can pause the graph, fetch user input, and
continue execution seamlessly, all while ensuring states are securely
stored in your chosen database.
MemorySaver +
Interrupts = Human In The Loop
Human-in-the-loop systems are essential to modern applications,
allowing seamless integration of human feedback into automated
workflows. With the help of the MemorySaver feature,
you can build applications using LangGraph that pause, capture user
input, and resume execution effortlessly.
In workflows involving human interaction, there are moments where the
application needs to pause, gather feedback from the user, and then
continue processing. For instance, consider a sequence of tasks
where:
A process executes its initial steps.
The system pauses to collect human input.
The workflow resumes, incorporating the user’s feedback.
This type of flow requires interrupts to halt the
execution and persistence to save the current state of
the workflow. LangGraph provides the tools to manage both seamlessly.
Implementation
To illustrate, let’s build a straightforward graph with the following
steps:
Start with a simple initial node.
Execute a task and pause for human feedback.
Resume execution with the updated state and complete the
workflow.
We use LangGraph's MemorySaver, a checkpointing tool
that saves the workflow’s state in memory after each node’s execution.
This ephemeral storage method is perfect for local testing and
prototyping. Here’s a simplified version of the setup:
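A minimal sketch follows; the three-node layout, node names, and State
schema are illustrative assumptions:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver

class State(TypedDict):
    input: str
    user_feedback: str

def step_1(state: State):
    print("--- Step 1 ---")
    return {}

def human_feedback(state: State):
    print("--- human_feedback node ---")
    return {}

def step_3(state: State):
    print("--- Step 3 ---")
    return {}

builder = StateGraph(State)
builder.add_node("step_1", step_1)
builder.add_node("human_feedback", human_feedback)
builder.add_node("step_3", step_3)
builder.add_edge(START, "step_1")
builder.add_edge("step_1", "human_feedback")
builder.add_edge("human_feedback", "step_3")
builder.add_edge("step_3", END)

# Pause before the human_feedback node so the user can inject input.
memory = MemorySaver()
graph = builder.compile(checkpointer=memory, interrupt_before=["human_feedback"])
```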
The graph visualization, generated with Mermaid.ink, is shown here:
hitl-graph
MemorySaver Implementations
Integrating human feedback into automated systems is a growing trend
in AI development. It bridges the gap between machine automation and
human judgment, enabling better decision-making, improved accuracy, and
adaptability. In this section, we explore how to incorporate
human-in-the-loop functionality into a graph-based system while
leveraging memory storage to track execution states. This walkthrough
showcases the process from initialization to final execution.
print("### State after update ###") print(graph.get_state(thread))
print(graph.get_state(thread).next)
for event in graph.stream(None, thread, stream_mode="values"): print(event)
The graph’s execution is tied to a thread variable, a
dictionary initialized with a thread_id. This serves as a
session or conversation identifier, distinguishing various graph runs.
For simplicity, the thread_id is set to 1,
though a more robust implementation would use a UUID. The graph
processes events using graph.stream(), which accepts the
initial input and thread details. Events are streamed in "values" mode, and
each event is printed for transparency.
During execution:
Input is processed.
Node executions are logged.
Interruptions allow for dynamic human input.
Running the graph in debug mode provides insights into:
Memory storage (memory.storage) containing nested
objects that log the graph state.
Transition logs for each node, showing updates or lack thereof.
At an interrupt, human feedback is solicited using Python's built-in
input() function. This input updates the state dynamically.
Once human input is integrated, the graph resumes execution. Subsequent
steps process the updated state, leading to the graph’s completion.
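Concretely, the interrupt-update-resume cycle might look like the
following, assuming the graph from the sketch above (the thread_id value
and state keys are illustrative):

```python
# Run until the interrupt fires before the human_feedback node.
thread = {"configurable": {"thread_id": "1"}}
for event in graph.stream({"input": "hello"}, thread, stream_mode="values"):
    print(event)

# Solicit feedback and write it into the paused state, attributing the
# update to the human_feedback node.
user_input = input("Tell me how you want to update the state: ")
graph.update_state(thread, {"user_feedback": user_input}, as_node="human_feedback")

# Streaming with None as the input resumes from the saved checkpoint.
for event in graph.stream(None, thread, stream_mode="values"):
    print(event)
```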
SqliteSaver
Switching from an ephemeral memory-based state saver to a persistent
database saver can significantly enhance the durability and traceability
of your graph’s execution. In this section, we’ll explore how to replace
the in-memory MemorySaver with a SqliteSaver
for long-term storage and easy debugging.
The MemorySaver is transient, meaning all state
information vanishes after the program stops. By using an SQLite
database, you can:
Persist graph states across runs.
Debug and troubleshoot using a structured database.
Resume executions exactly where they were interrupted.
print("### State after update ###") print(graph.get_state(thread))
print(graph.get_state(thread).next)
for event in graph.stream(None, thread, stream_mode="values"): print(event)
We start by importing the required modules. Then we initialize a
connection to the SQLite database. The
check_same_thread=False flag ensures thread-safe database
operations, essential for stopping and restarting execution across
different threads. After that, we create an instance of
SqliteSaver and pass it the SQLite connection. This saver
integrates seamlessly with the graph execution pipeline, persisting
states to the SQLite database.
Initial Execution: Run the graph with the
SqliteSaver. After execution, you’ll see a new file,
checkpoints.sqlite, created in your project directory.
Inspect the Database: Use your IDE’s database tools
(e.g. SQLite3 Editor for VS Code) to load and inspect the
checkpoints.sqlite file. You’ll find a table storing graph
states, similar to what you’d see with MemorySaver, but now
it’s persistent.
screenshot_sqlite_ide
Changing the thread_id allows you to simulate a new
session while retaining access to previous runs. When resuming, the
graph starts from the last recorded state. You can verify this by
inspecting the database entries for the new thread_id.
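A quick sketch of that session switch (the new thread_id value is
arbitrary):

```python
# A new thread_id starts a fresh session; earlier checkpoints stay queryable.
new_thread = {"configurable": {"thread_id": "2"}}
for event in graph.stream({"input": "hello again"}, new_thread, stream_mode="values"):
    print(event)

print(graph.get_state(thread))  # the previous session is still in the database
```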
For enhanced traceability, integrate LangSmith for tracking and
debugging. LangSmith provides detailed insights, including thread
metadata and execution traces.