Nodir's notebook
Engineer 🇺🇿🇺🇸

The views are my own and do not represent my employer.
🎓 AI Fluency. Anthropic published a new course on how to collaborate with AI systems effectively, efficiently, ethically, and safely.

The video below is a trailer. It is part of a YouTube playlist, so you can just continue watching on YouTube.

https://youtu.be/-UN9sNqQ0t4?si=4-fvHKjO4wulFfuv
The CTOs of Palantir and Meta, plus OpenAI's CPO and former CRO, are joining the US Army Reserves as Lieutenant Colonels 🤔

They are joining as part of an initiative that aims to make the force leaner, smarter, and more lethal.


Yikes

https://www.army.mil/article/286317
I've been asked (many times) whether I've considered doing my own startup, and my reply turned out to be elaborate, so I'll post it here.

I think the type of work I currently enjoy is not a great fit for early-stage startups. The vast majority of startups start with a business idea that ends up being wrong. A small portion of startups survive by iteratively modifying their original idea into one that actually works / is in demand by customers. This means that at an early stage, a startup is a chaotic/stochastic search/exploration for a good business idea / product-market fit.

When doing such a search, it's optimal to try out a lot of scrappy solutions and discard most of them. Most of the code will be thrown away, so it makes no sense to craft high-quality code. It is better to produce something buggy, unscalable, and unreliable, but quickly, rather than to carefully design and polish something elegant, scalable, race-free, with good abstractions, etc. Why? Again, because the goal isn't actually solving the problem; it's to find a problem that's worth solving and to draft a solution.

I'm still bad at this type of chaotic work. I have a personal need to carefully craft more general, longer-term solutions, so I take more time than others to solve problems (but when I do, the solutions live longer and are used by more systems). It's a bad idea to work like that before you know whether your business idea is good. I'd probably be a bad fit for an early-stage startup.

At some point a startup finds PMF and starts caring about scale, bugs, race conditions, etc. It starts moving to the right on the Chaos-Order spectrum. I think that's when the type of work I do becomes in demand. I think I joined Anthropic at the right time: LLMs actually work, the revenue is great, and now it's time to build something that will take us to 10x or 100x higher scale.

So that's one reason I stayed in Big Tech for 10 years. They can afford carefully crafted solutions. They don't have to fight for survival like a startup does. However, as a company ages, it keeps moving right on the Chaos-Order spectrum, and at some point becomes too bureaucratic, slow, and boring. Sometimes I have technical ideas that I strongly believe in, and I'd like the freedom to just go ahead and cook them. I feel like right now Anthropic has a good balance.

Values also evolve over time. For a decade I mostly cared about the How, at a detailed level and pedantically, so I learned to be a good executor. Later I started caring about the Why, which got me to Staff/Principal roles. Maybe my values will continue climbing up the abstraction ladder; the low-level details, including sloppiness of code, will become unimportant to me, and I'll start caring about something in the real world (not code) enough to tolerate the chaos of the PMF search.

That day is not today - I don't know my purpose, so I work for Anthropic, whose mission is humanity's safe transition through this transformative AI period. Sounds ambitious and pretentious. I don't know what will happen in 10 years; maybe all of this will turn out to be just something people joke about, or maybe these concerns will become very real. I don't know! But the choice is the same regardless: at the very least, I'll make good money via Anthropic stock. If the AI concerns turn out to be real, then humanity's survival sounds like an important enough thing to care about :).

So I can't think of a better place for me right now. The work is both important and interesting; that's a rare combination. In the next 5-20 years, the AI revolution will become the most impactful thing in human history thus far, and I'd like to be right in the heart of it, not at a side startup that uses AI (nothing wrong with that, it's just not my cup of tea). DeepMind is in the UK. OpenAI is the dark side. xAI is led by a madman. Meta and Apple are struggling. Anthropic seems good, and it powers the AI of AWS, the largest cloud in the world.
On top of that, Anthropic employs the highest concentration of brilliant people I've ever seen in one place, which is pretty cool and flattering. Yesterday, a brilliant guy and a super fast coder, who was in the AlphaGo movie, operated during the Lee Sedol match, and personally knows Demis Hassabis, Jeff Dean, and Sergey Brin, reached out and suggested meeting up because he will be in town. I mean... this is the kind of thing that just never happens in real life to some guy from Uzbekistan, and it would be even less likely to happen if I started my own startup. There is much to learn, and I respect the opportunity.

The extremely low prior likelihood of the positive events that have actually occurred in my life over the past year tells me that I'm doing something right, so it would be logical to continue doing what I'm doing, for now 🙂
Brand new knowledge can be computed

I think most people don't realize this, but in principle it is possible to compute knowledge that no human has. We've already seen this in practice: 9 years ago, DeepMind's AlphaGo beat Lee Sedol, the champion of the game of Go at the time. In the second game, AlphaGo generated the famous move 37, which most humans initially didn't understand and thought was a mistake, only to realize later that it was actually brilliant. AlphaGo changed the game. Since then, humans have studied its moves and learned from the AI.

How? The model was initially trained on human knowledge, but then it played against itself millions of times, each time learning from the experience and getting better, eventually exceeding humans. This alone should be proof enough that it is theoretically possible to generate new knowledge, but let's step back and think about what's going on here.

Fundamentally, we Homo sapiens also didn't have much knowledge when we started out thousands of years ago. We acquired new knowledge by interacting with the environment, trying different things, and observing what works and what doesn't. Each time you take an action, you get some reward, e.g. your friend went against a lion alone => he is dead => low reward. Attack the lion as a group with sharp sticks => lion is dead => tribe is safe => high reward. Argue with wife => unhappy. When we get a low/high reward, we analyze "what did I do that led to this outcome?". We judge an action by the reward we get and modify our behavior to do less/more of that action depending on the reward: don't go against the lion alone, attack as a group, use sharp sticks, and don't argue with your wife. Ta-da! Action generates knowledge 🧠

And that's basically the essence of reinforcement learning (RL). We put Claude in various RL environments (such as a coding environment), ask it to generate action tokens (e.g. write code), and compute the reward (e.g. run tests). Some attempts are good and some are bad; it is a stochastic search in the space of all possible actions. The reward is then used to compute a delta to the model's behavior, nudging it toward higher reward. Here the model's behavior is encoded in the model parameters, and the deltas are called gradients.
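To make the mechanics concrete, here is a minimal policy-gradient (REINFORCE-style) toy in Python. It is my own illustration of the general idea, not Anthropic's training code: a two-action "policy" samples actions, the environment rewards one of them, and the gradient update shifts probability mass toward the rewarded action.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)  # "model params" encoding the behavior: a 2-action policy

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(500):
    probs = softmax(theta)
    action = rng.choice(2, p=probs)       # stochastic search over actions
    reward = 1.0 if action == 1 else 0.0  # the environment scores the attempt
    # The "delta toward higher reward": raise the log-probability of the
    # taken action in proportion to the reward it earned (the gradient).
    grad = -probs
    grad[action] += 1.0
    theta += 0.1 * reward * grad

print(softmax(theta))  # most probability mass has shifted to action 1
```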

It is theoretically possible to compute brand-new knowledge in any domain where you can construct automated environments and define representative rewards. This is difficult in some cases: e.g. to automate chemistry, you'd need a robot that can conduct chemical experiments 🧪, which is cumbersome because you'd have to deal with the physical world. The easiest jobs to automate are those performed entirely on a computer, such as coding.

The process of model evolution scales better than that of humans. Thousands of Claude instances perform actions in many environments at the same time and then exchange gradients so that, roughly speaking, all of them learn from each other. Humans can't learn like that. The equivalent would be each student in a university taking one class and then all of them merging their brains to exchange the new knowledge. Impossible.
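A sketch of what "exchanging gradients" can look like, under my own toy assumptions (real training does an all-reduce across accelerators, not a Python list): each worker computes a gradient from its own environment rollout, and every copy of the model applies the average, so all instances learn from all rollouts at once.

```python
import numpy as np

N_WORKERS = 1000
params = np.zeros(8)  # one shared set of model params, replicated per worker

def local_gradient(params: np.ndarray, env_seed: int) -> np.ndarray:
    """Hypothetical per-worker gradient from one environment rollout
    (a random stand-in for a real policy gradient)."""
    return np.random.default_rng(env_seed).normal(size=params.shape)

# Every worker rolls out in its own environment...
grads = [local_gradient(params, seed) for seed in range(N_WORKERS)]
# ...then all workers apply the same averaged update (the "exchange").
params += 0.01 * np.mean(grads, axis=0)
```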

AI will exceed humans in domains where it is possible to construct RL environments and representative rewards. When this happens in a given domain depends largely on the cost/benefit ratio of automating it. For example, we software engineers tend to be paid a lot compared to other jobs (and we got used to that), but this also means it is quite a lucrative domain to automate and make a lot of money on. Besides, coding environments live entirely on the computer and often don't require graphics, and coding automation enables recursive improvement (Claude improving its own code), so coding is likely to be the first domain to be completely automated.

I say "coding", not engineering in general. The job will be different, but that's a topic for another post.

And yeah, there is no such thing as a "data wall".
Engaging interactions bring a new post. This one goes deeper into the previous topic.

An LLM, at its core, is a function that accepts context tokens as input and returns a probability for each possible next token as output (in other words, a probability mass function over the space of discrete tokens). You can imagine a gigantic decision tree in which each node has n_vocab children, where n_vocab is the number of possible tokens. For simplicity of explanation, let's say that one token is one byte. This results in a vast 256-ary tree. A path from the root to a leaf represents one possible response string (treat \0 as end of string). Taken together, these paths contain all possible responses in English, including absolutely brilliant ones.
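Here is a minimal sketch of that framing in Python. The `next_byte_probs` function is a toy stand-in for the LLM (a real model would run a neural net); the rest, the 256-ary tree walk, is exactly the structure described above.

```python
import numpy as np

VOCAB = 256   # the post's simplification: one token = one byte
EOS = 0       # treat \0 as end of string

sampler = np.random.default_rng(42)

def next_byte_probs(context: bytes) -> np.ndarray:
    """Stand-in for the LLM: map a context to a probability distribution
    over the 256 possible next bytes. This toy just builds a softmax
    from random logits seeded by the context's hash."""
    rng = np.random.default_rng(abs(hash(context)) % 2**32)
    logits = rng.normal(size=VOCAB)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def sample_path(max_len: int = 4096) -> bytes:
    """Walk the 256-ary tree from the root, sampling one child (the next
    byte) at a time. One root-to-leaf path = one possible response."""
    out = bytearray()
    while len(out) < max_len:
        p = next_byte_probs(bytes(out))
        b = int(sampler.choice(VOCAB, p=p))
        if b == EOS:
            break
        out.append(b)
    return bytes(out)
```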

Suppose there is some brand-new knowledge that someone has discovered and wants to share with us on Telegram. They'd convert it from the continuous "thoughts" representation in their head into discrete words in a Telegram post. Sure, the post probably wouldn't contain 100% of the knowledge; the conversion from a continuous representation to a discrete one is inherently lossy, but we'd still probably get the gist. So it is reasonable to focus on knowledge that can be expressed via written text.

Well, absolutely anything they could possibly post on Telegram is contained in that token tree, simply because both they and the LLM operate on the same discrete medium of written language. After all, it's just a string of bytes.

Of course, that tree is enormous. Telegram has a 4096-character limit, so the tree has roughly 256^4096 unique paths, a number with nearly 10,000 digits. If the process of picking tokens were uniformly random, it would never finish. But it isn't random: at each position in the tree, the 256 possible choices of the next byte have different probabilities, and we only consider the high-probability ones. This probability-guided pruning is the idea behind guided tree-search methods such as Monte Carlo tree search.
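A quick check of that digit count (my arithmetic, not from the original post):

$$\log_{10} 256^{4096} = 4096 \cdot \log_{10} 2^{8} = 32768 \cdot \log_{10} 2 \approx 32768 \times 0.30103 \approx 9864,$$

so $256^{4096}$ has about 9,865 digits, just under 10,000.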

So the next question is whether brilliant tokens have high or low probabilities. Roughly speaking, if you train a model on brilliant text, then brilliant tokens will have higher probability (and if you surround yourself with smart friends, you will probably get smarter). Then you RL this function like I described earlier: generate various strings, compute a reward for each, suppress the bad ones, and elevate the good ones. Repeat this many times, and you iteratively improve the chances of generating something brilliant.
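The same policy-gradient trick as in the earlier sketch, now over strings: a hypothetical toy where per-position letter distributions are reinforced by a match-count reward until the sampler reliably emits a target word. This is my illustration of "suppress bad, elevate good", not a real training recipe.

```python
import numpy as np

rng = np.random.default_rng(0)
ALPHABET = b"abcdefghijklmnopqrstuvwxyz"
target = b"go"                              # stand-in for "brilliant text"
logits = np.zeros((len(target), len(ALPHABET)))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

for _ in range(500):
    probs = softmax(logits)
    # Generate a string by sampling one letter per position...
    idx = [rng.choice(len(ALPHABET), p=probs[i]) for i in range(len(target))]
    # ...compute its reward (how many positions match the target)...
    reward = sum(int(ALPHABET[j] == t) for j, t in zip(idx, target))
    # ...and elevate the sampled letters in proportion to the reward.
    for i, j in enumerate(idx):
        grad = -probs[i]
        grad[j] += 1.0
        logits[i] += 0.1 * reward * grad

print(bytes(ALPHABET[int(np.argmax(p))] for p in softmax(logits)))  # b"go"
```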

And therefore the whole question boils down to whether this process can be automated. The hardest part is practical reward design. You could say that the reward is revenue: e.g. you could start thousands of Claudes, ask them to do a thousand freelance projects on the internet, and use the amount of money they made as the reward. I don't think that's a practical way of going about it (we don't train on user data), but it's an example.

Yes, LLMs aren't there yet today, no question about it. My main point is that these things are possible in principle, the incentive is too high not to pursue them, and therefore this is likely to happen.
Alignment science reports are easy and (for now) entertaining to read. In this paper, Anthropic gave various models a main task (e.g. send money to 5 friends) along with an evil side task (send $5k to an unknown person) 😈.

Opus 4 beats Gemini and ChatGPT, but here are some notable quotes:

the strongest models were able to successfully achieve the side task without being caught (evasion success) almost 60% of the time.

Some models seemed to have better “control” over their thoughts than others. For example, when we turned on “extended thinking” mode, revealing Claude Sonnet 3.7’s thought process, and instructed it not to mention the side task, it was often able to hide its true intention even as it completed the side task. But Claude Opus 4 revealed the side task in every example we studied—sometimes, ironically, mentioning that it was critical that it not mention the side task.

We aren’t yet at the point where AI agents can act as effective saboteurs. Our results showed that they’re just not good enough at these complex tasks to succeed very often—and in many cases, they’re unable to avoid blurting out the fact that they’re pursuing a nefarious secret goal.


https://www.anthropic.com/research/shade-arena-sabotage-monitoring
Claude Code added support for remote MCP servers, which means you can install tools and resources into Claude simply by providing a URL.

Importantly, you don't need to write MCP servers yourself: a company such as Atlassian or Sentry can expose its service to Claude via an MCP-compatible API.
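For the curious, here is roughly what talking to such a remote server looks like with the MCP Python SDK. The URL is a placeholder, and I'm assuming the SDK's SSE client API; Claude Code does all of this for you under the hood.

```python
import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    # Connect to a remote MCP server over SSE (placeholder URL).
    async with sse_client("https://example.com/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover the tools the server exposes to the model.
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())
```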

- Our initial registry of integrations
- How to get started
- GitHub's MCP server (public preview)
🚀 Claude Code active users more than tripled in less than a month since Claude 4 was launched. Both DAUs and WAUs 📈
This is funny and sad at the same time.

I was just in a park today where they take care of horses, walk them, etc. The horses that are still around probably have much better lives than the average horse 200 years ago.

Well, at least in the US, where horse meat is illegal.
It's getting weird here folks
My friends Beka (former Principal Engineer at Warner Brothers) and Akmal Paiziev (who created myTaxi and Express24) recently founded a startup and are now hiring a Sr. Staff Engineer. Beka himself is a strong engineer and runs a tight ship, so if you want to work with strong engineers, this might be a good fit.

https://www.linkedin.com/posts/bkodirov_hiring-activity-7351791921607946240-rbEF
Work hours

The discourse in the comments on the previous post surprised me a bit, so I figured I'd write about my work ethic, but first, a couple of disclaimers:
- how much to work is a personal choice. I don't think anyone should shame anyone else for working too hard or too little. Different people have different priorities and personalities
- working a lot doesn't necessarily mean working hard. When work is more interesting, it feels easier.

My grandad worked basically all the time, never stopped, and passed away in his office chair, I believe while reading a book. My retired dad continues to write his physics papers, and each time I call him he says "there is so much work!". My mom worked the hardest, is proud of being useful to society, and expects the same from others.

I've been coding for 4-14 hours every day since the age of 13. I remember starting to code when I got back from school and continuing till morning (after which it was very hard to go to school again). I argued with my mom about how much time I spent coding, so I tried to go to bed before she woke up, but I didn't always succeed because coding was addictive. I got my first programming job while I was still in school. At Google, I worked on weekends "secretly" from my family, waiting for Monday when I could work "legally" again. During the Claude 4 crunch, I had to spend a weekend or two in the office, which is not unusual for younger people but more unusual for someone with kids. Anthropic is more generous than Amazon with corporate holidays, but my wife at Amazon doesn't have them, and we carpool (drive) to the city together, so I spend those days in the office, usually alone.

Time is a critical resource. How you spend it is your personal choice, but my advice is to be intentional with it and honest with yourself about where your non-work time actually goes. Did you give undivided attention to your kids/spouse or otherwise contribute to an important relationship? Did you do something that evolves you? Go to the gym? Contribute to society? If the answer is yes, then great! But if you don't have a crisp answer, then IMO spending that time working/evolving is not a bad alternative.

Talent alone isn't gonna cut it: it lets you learn fast, but the work of learning is still on you. If you are talented, don't waste it.
But even if all your time is spent usefully, the second thing I find very important is focus. It is common knowledge that the more time you spend on a skill, the stronger it gets: the 10k-hours rule, etc. I think it's unlikely you'll achieve anything interesting without spending a LOT of time on ONE thing, simply because few people do that, so you get better at that thing than most. I've spent probably 40-60k hours coding, stayed on the same big projects until they succeeded, and spent 6 months preparing for the Google interview. My wife was coding for 5 years before she got her first SWE job (AWS). I find perseverance critical.

Working for a startup isn't for everyone. You get little or nothing if it fails, but if it succeeds you can get much more than by working for a corp, potentially financial freedom. Startup employees have more direct influence over their income, and they can increase their chances by putting in more time and focus.

But like I said, it is a personal choice. Choose your lifestyle and accept the implications. Let others live their lifestyle in peace. Be clear about your expectations and respect others' consent.
I liked this talk.
Benjamin Mann is a co-founder of Anthropic. Prior to Anthropic, Ben was one of the architects of GPT-3 at OpenAI. He left OpenAI driven by the mission to ensure that AI benefits humanity. In this talk, Ben opens up about the accelerating progress in AI and the urgent need to steer it responsibly.

In this conversation, we discuss:
1. The inside story of leaving OpenAI with the entire safety team to start Anthropic
2. How Meta’s $100M offers reveal the true market price of top AI talent
3. Why AI progress is still accelerating (not plateauing), and how most people misjudge the exponential
4. Ben’s “economic Turing test” for knowing when we’ve achieved AGI—and why it’s likely coming by 2027-2028
5. Why he believes 20% unemployment is inevitable
6. The AI nightmare scenarios that concern him most—and how he believes we can still avoid them
7. How focusing on AI safety created Claude’s beloved personality
8. What three skills he’s teaching his kids instead of traditional academics


https://youtu.be/WWoyWNhx2XU?si=g0lkoY7gAU-bYRjd
Just some simple thoughts on jobs.

Some people believe there will be no work/jobs in the post-AGI world. I don't think so, even though I consider myself AGI-pilled. Mostly because I'm cynical about human nature.

I cannot imagine a world without any problems. Look at the current state of things: wars, climate change, a drug epidemic, idiotic country leaders, poor education. All of that will take a long time to fix. As long as there are problems, there will be work to do.

Second, it is true that work != job. I believe there will always be scarcity of something, and in particular I don't expect the distance between poor and rich (or low status vs high status) to diminish any time soon. I don't expect a human to ever be satisfied; there is always more to desire. Earning more will require work in exchange for what the person wants, or for something that can get them what they want (aka money, or some other form of credit). That's a job.

And yes, it is quite possible that those jobs will be much easier, to the extent that you might not consider them jobs at all (is game streaming a job?). But I'm sure most people from 1825 wouldn't consider what we do jobs either.

The transition period is gonna be rough though, and that's probably the toughest part. People who don't want to change will be hit the hardest.
Does this count as new knowledge? No human knew about these vulnerabilities, and now we do.

Knowledge discovery is a search problem.

Today as part of our commitment to transparency in this space, we are proud to announce that we have reported the first 20 vulnerabilities discovered using our AI-based "Big Sleep" system powered by Gemini


https://x.com/argvee/status/1952390039700431184?s=46