I'm also very wary of their analysis method, given classifiers-gonna-classify. We already see it in their example of someone asking why their game is crashing, which gets bucketed into the Computer & Mathematical occupation. I'm guessing the original question came not from a game developer but from a game player, so can you really call this an occupational task? Sure, it's in that domain, I guess, but in a completely different context. If I'm asking how to clean my dishwasher, that hardly makes it a repair or industrial occupation task.
Still, it's cool they're doing this.
https://trends.google.com/trends/explore?date=today%205-y&ge...
Which suggests that the most common use is as a tutor / cheating on homework.
The troughs in that graph all fall during prime US school/college vacation times: summer, winter, and spring breaks. And the magnitude of each fall corresponds to how long the break typically is. To me, that makes a lot of sense.
Most of those kids will continue to use it as they graduate, having embedded it in their workflow (unfortunately many will probably fully outsource all thinking to it, having learned a lot less since it did it all for them).
What are you referring to?
https://www.404media.co/ceo-reminds-everyone-eightsleep-pod-...
Not statistically.
Is the soup smooth or lumpy? Striated or uniform? For that matter a soup could (and often does) involve huge soup bones that give it important parts of its flavor, but never show up directly in a spoonful. And you might need something different from a spoon to convincingly rule out some specific rare lumpy ingredient.
The didactic value of sampling the soup pot goes well beyond its basic function: correcting the beginner's misperception that a sample's statistical power is directly related to population size :)
Since there aren't a billion students in the USA, 35% of them is an impossibility.
If you scale your population above some recognized boundary you aren't sampling in the same space any more. After all, the local star density within 1 AU tends very strongly to 1. That's not indicative of the actual star density in the Milky Way.
No, the most important thing is the distribution of the sample. You have to make sure it isn't obviously biased in some way (e.g., only surveying students at a university to extrapolate to the entire population of the country). Beyond that, the required sample size levels off quickly.
A sample of 5,000 (assuming the same distribution) won't be any more or less accurate for a population of 10M than it is for 1M.
Of course, if you just ask everyone, or almost everyone, then you no longer need to worry about distribution. But yeah.
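To make the point concrete, here's a minimal sketch (using the standard worst-case proportion formula with p = 0.5 at 95% confidence; the numbers are illustrative, not from any survey) showing that population size barely touches the margin of error for a fixed sample of 5,000:

    import math

    def margin_of_error(n, population, p=0.5, z=1.96):
        # 95% margin of error for a proportion, with the finite population
        # correction (FPC), the only place population size enters at all.
        fpc = math.sqrt((population - n) / (population - 1))
        return z * math.sqrt(p * (1 - p) / n) * fpc

    for pop in (1_000_000, 10_000_000, 300_000_000):
        print(f"population {pop:>11,}: MOE = {margin_of_error(5000, pop):.4%}")
    # population   1,000,000: MOE = 1.3825%
    # population  10,000,000: MOE = 1.3856%
    # population 300,000,000: MOE = 1.3859%

The FPC is the only term involving the population, and it tends to 1 as the population grows, which is exactly why the required sample size levels off.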
Software engineering is a weird niche that is both a high paying job and something you can almost self-teach from widely available free online content. If not self-teach, you can rely on free online content for troubleshooting, examples, etc.
A lot of other industries/jobs are more of an apprenticeship model, with little data and even less freely available on open internet.
I think you massively underestimate just how much data is online for everything, especially once you include books, which are freely available on every possible subject (illegally, perhaps, but if Meta can download them for free then so can everyone else).
There's less noise for many other subjects than for software engineering; there are often just a couple of competing ways to do something rather than hundreds. There might be just one coursebook rather than thousands of tutorials. But the data for self-teaching is absolutely there.
Medicine faces two key challenges. First, while research follows the scientific method, much of what makes a good doctor—intuition, pattern recognition, and clinical judgment—is rarely documented. Second, medical data is highly sensitive, limiting access to real-world cases, images, and practice opportunities. Theory alone is not enough; hands-on experience is essential.
Law presents a different problem: unknown unknowns. The sheer volume of legal texts makes it nearly impossible to be sure you’ve found everything relevant. Even with search tools, gaps in knowledge remain a major risk.
Compounding this is the way law is actually practiced. Every judge and lawyer operates with a shared foundation of legal principles so basic they are almost never discussed. The real work happens at two higher levels: first, the process—how laws are applied, argued, and enforced in practice. Then, at a third, more abstract level, legal debates unfold about interpretation, precedent, and systemic implications. The first level is assumed, the second is routine, and only the third is where true discussion happens.
Self-teaching is easier in fields where knowledge is structured, accessible, and complete. Many subjects are not.
Similarly, I would wager most of the useful economics and financial theory that humans have come up with is only known to hedge or prop trading firms.
For some subjects, the entire journal-published academic body of knowledge for it is probably some useless fraction of the whole and university academia is operating nowhere close to the cutting edge. People are probably doing PhDs today on theses that some defense contractor or HFT firm already discovered 20 years ago.
Even things like specialized medical knowledge, I would wager is largely passed down through mentor-mentee tradition and/or private notes as opposed to textbooks. It's unlikely that you can teach yourself how to do surgery just from textbooks. I once had a pathologist's report use a term for a skin condition that was quite literally ungoogleable. The skin condition itself was fairly ordinary, but the term used was outright esoteric and yet probably used on a daily basis by that pathologist. Where did he learn it from?
Not everything is on the Internet.
If the instructions aren’t immediately available, the internet provides connections and forums to find anything your heart desires.
Information wants to be free.
Arbitraging micro-opportunities (or far more likely, deploying insider information masked as HFT or some secret sauce arbitrage) is not economically useful.
Sure, you can learn all about power electronics by yourself. But have some ideas you want to implement? Hundreds to tens of thousands of dollars.
If you meant programming, I agree it could be self-taught, but not SE. SE is the set of techniques, practices, and tools that we have collected over decades for producing multi-versioned software that meets a certain reliability rating. Not all of these are freely available online.
Thing is, I’ve never met someone in software with a professional license.
I had about 12 YoE at the time, and my manager didn't realize I didn't have a degree until after I was hired. Apparently it hadn't affected my offer, and he was more impressed than anything.
You say:
> SE is the set of techniques, practices, tools that we have collected over decades for producing multi-versioned software that meets a certain reliability rating. Not all of these is freely available online.
The same way there's no single guide on the internet on how to be the kind of engineer who builds reliable or extensible software, I don't think there's a guide hiding in the average CS curriculum.
Most of it consists of getting repetitions building software that involves the least predictable building block in all of software engineering (people), in all its various forms: from users, to other developers, to yourself (in the future), to "stakeholders", etc.
Learning how to predict and account for the unpredictability in all the people who will intersect with some facet of your software is the closest I've seen to a "universal method" for creating software that meets the criteria you defined.
And honestly I'd be concerned if someone told me you can just be taught some blessed set of tools and practices to get around it... that sounds a lot like not having actually internalized why they work in the first place, and the "why" is arguably more valuable than the tools and practices themselves.
On the other hand, an open and level playing field has not existed in the thirty-odd years of open-market software development. No one since Seymour Cray has done complete systems design, really... it is turtles all the way down. You have to get hardware to run on, and the software environment will have been defined for that: CPU architectures and programming languages. People who write whole systems generally do it in teams.
The arrogant and self-satisfied tone of the corporate worker-bee says that there is no such thing as real software engineering skills?
Like defining "health" or other broad topics: the closer the topic is examined, the more holes appear in the arguments. I am glad I never punched a time clock for Elon Musk, however, all things considered.
I feel very fortunate that the core blender devs had the patience to put up with my stupid amateur mistakes while I learned the skills to become a helpful contributor back in the day.
The contrast is that for a lot of other jobs, the rote tasks are not routinely solvable with free online content in text form.
In marketing, the entire low-end to mid-tier market is gone. Instead of teams working on projects for small to mid-sized companies, there's now a single senior managing projects with the help of LLMs. I know multiple agencies that cut staff by 80-90% without dropping revenue.
Translation (of books, articles, subtitles) was never well paid, even for very complex and demanding work. My partner did it a bit on the side, mostly justifying the low pay with some moral blah about spreading knowledge across cultures... With LLMs you can completely cut out the grunt part. You define the hard parts (terms that don't translate well), round out the edges and edge out the fluff, and every good translator becomes two to ten times more productive. Since work is usually paid by the page, people in the industry got a very decent (at least temporary) pay jump, I would imagine around 100%.
Support is probably the biggest one though. It is important to remember that outsourcing to India only works for English-speaking countries, and even that isn't super cheap. Here in Germany, if you don't have back-up wealth, it is your constitutional right to get some support from the state (~1400 euro), but you are obligated to find a job as soon as possible, and they will try to help you find a role. Support was always one of the biggest industries to funnel people towards. I talked to a friend working there, and according to them the industry has basically stopped advertising new positions; the only ones left are in financial services. The rest went all in on LLMs and just employ a fraction of the support staff to deal with cases that escalate far enough.
And that's not even touching on all the small things. How much energy is spent on creating pitch decks, communicating proposals, writing documentation, etc.? It probably amounts to as much as 50% of work in large orgs, and even if you can save just 5% of your time by using LLMs to phrase or organize, there is a decent ROI for companies paying for them.
There's just no countervailing force to make these decisions immediately painful for them. Sectors are monopolized, people are tired and desperate, and tech workers are in a basically unprecedented bout of instability.
The situation is super dark from a lot of angles, but I don't think it's really "the overwhelming usefulness of AI" that's to blame here. As far as I can tell, the biggest thing these technologies are doing is providing a cover story for private-equity-style guttings of various knowledge work verticals for short-term profit, which was kind of inevitable given that's been happening across the board in the larger economy, it's just another pretense that works for different verticals.
There are cases where LLMs seem genuinely useful (mostly ones that are for and by SWEs, like generating documentation or smoothing the ramp of learning new libraries or languages), and those don't seem to be "transformative" at scale yet, unless we count "transforming" many products into buggier products that are more brittle and frustrating to interact with.
I'm finding it hard to reconcile this with my own experiences. My whole team (5 people) left last year (for better pay, I guess), and the marketing agency in Germany I'm working for had to substitute them with freelancers. To offset the cost they fired the one guy who was hired to push the whole LLM/AI topic. We managed to fill one junior position by offering 10k+ more than in their last job. The firm would love to hire people to replace the freelancers. We did have to cut costs lately, but mostly they closed the kitchen, which wasn't used due to the work-from-home policy. I definitely don't see any staff reduction due to automation / LLM use. They still pay (external) people 60€ per written text/article, because clients don't like LLM-written stuff.
- Synchronous translation at political/economic events still needs a person, as it ever did
- LLMs are nowhere near the level to be able to translate fine literature at a high enough quality to be publishable
- Translating software is still very hard, as the translator usually needs a ton of context/reference for commonly used terminology
- We partnered with a machine translation company, and what they produced sucked balls
I have friends who work as translators, and we make use of translation services as a company, and I haven't seen the work going away.
This just isn't true, it's nowhere close.
If this was true we would see the results in productivity and unemployment stats. We don't though, so far the effect hasn't registered.
I love Claude, but let's not ignore that in the LLM race, they're not exactly the leading player.
It is faster than the reasoning/chain-of-thought models. With the current o1 and DeepSeek, though, I haven't logged into Claude in a few weeks.
I have no inside knowledge but I am kind of expecting Sonnet chain of thought any day now and I am sure that will be incredible.
Anthropic's LLMs always (always? at least since Claude 2) have a distinctive "personality". I obviously don't know how to quantify it or what "it" really is, but if you've used them you might know what I mean. Maybe that "personality" is conducive to SWE?
We aren't leaving MS Office or Adobe because they already pushed out some minimal innovation. But what about the products you don't even know about? For lawyers, doctors, logistics, sales, marketing, wood workers, handymen? In Europe or Asia?
A new product bringing true innovation could easily push out legacy businesses through the "shiny new thing" (AI) and better UX alone. A lot of software in these areas simply hasn't improved in 10 years; with a great idea and a dedicated team, it's a landslide waiting to happen.
Google Gemini integration into their Docs/Sheets/Slides and Gmail will perhaps show different demographics in a few months, and that's before we've heard from OpenAI.
Maybe these models will get better as they’re given more context and can understand the full stack but for now they cannot.
And this is just with code where it already has billions of examples. Nevermind any less data-rich fields. The models still need to get smarter.
We already have thousands of geniuses working across our economies and teaching our youth. The best of our minds have every year or so been given a global stage in Nobel speeches. We still ignore their arses and will ignore it when AI tells us to stop fighting or whatever.
The real issue here is that wafer-scale chips give 900,000 cores, and nothing but embarrassingly parallel code can use them; frankly, no coder I know writes code like that. We have to rethink our whole approach now that Moore's law is over. Only AI has anything like the ability to use the processing power being built today; the rest of us could stick to cores from 2016 and nothing would change.
Throwing hundreds of billions at a bad way to program 1 million cores, because we have not rethought software and businesses to cope, seems wrong, both because "Whitey" can spend it on better things and because it is an opportunity: imagine being 900,000 times faster than your competitors. What does that even mean?
Edit: Trying to put it another way: there are two ways AI can help us. It can improve cancer treatments at every stage of medical care, through careful design and creation of medical AI models that slowly ratchet up diagnosis, treatment, and even research and analysis. This is human organisations harnessing and adapting around a new technology.
Or AI can become so smart it just invents a cure for cancer.
I absolutely think the first is going to happen and will benefit the denizens of the first world first. The second one requires two paradigm-shifting leaps in the same sentence. Ten years ago I would have laughed in Anthropic's face. Today I just give it a low probability multiplied by another low probability, and that is an incredible shift.
I feel like this has less to do with what LLMs are best at and more to do with which folks are most likely to spend time using a chatbot.
Minor nitpick. Use of the word 'spend' as a noun is not widespread and not well known.
Is the noun spend rare?
ChatGPT said:
The noun "spend" is relatively rare compared to its more common form as a verb. While "spend" is widely used as a verb (meaning to give money or time for something), as a noun, it refers to an expenditure or the act of spending, and it’s not as commonly encountered.
In most contexts, people would use alternatives like "expenditure," "spending," or "outlay" instead of "spend" as a noun. That said, it is still used occasionally in certain contexts, especially in financial or informal language.
The majority of audience and posters of ycombinator are not in that industry group, right?
And it’s not just them. To me this trend screams “valuations are too high”, and maybe hints at “progress might start to stagnate soon”.
https://www.anthropic.com/news/the-long-term-benefit-trust
https://time.com/6983420/anthropic-structure-openai-incentiv...
Then the people who funded / trained this "justice" out of the goodness of their hearts would actually have leverage, in terms of concrete power.
It's a much more subtle way to capture power, if you can replace the judges with your software.
Brave new world, indeed...
The whole thing about no ethical consumption under capitalism is just a way to enjoy the conveniences of capitalism from a moral high ground. It's totally doable, you just might not enjoy it haha.
The camel's gotta get its nose in the tent somehow.
It wouldn't brag about doing it while leaving out that they were specifically dealing with Palantir, because they know what they're doing is unethical: https://www.anthropic.com/news/expanding-access-to-claude-fo...
Being available for use by militaries is incredibly irresponsible, regardless of what scope is specifically claimed, because of the inherent gravity of the situation when a military is wrong. The US military maintains a good deal of infrastructure in the US; putting into their hands an unreliable, incompetent calculator puts lives at risk.
It would be structured as a non-profit (there are no teeth to a PBC; the structure is entirely to avoid liability, and if you have no trust in the executive body of an organization, it has zero meaningful signal).
It would have a different leadership team.
It would have a leader who could steelman his own position competently. Machines of Loving Grace was less redeeming than Lenat's old stump speeches for his position, despite Amodei starting up in an industry significantly more geared for what he had to say, and Lenat having an incredibly flexible sense of morality. Its leader would not have a history of working for Chinese companies, nor would he jingoistically advocate for export controls.
It would have different employees than the people I know who are working there, who have a history of picking the most unethical employers they can find, in a fashion not dissimilar to how Illumination Entertainment's "Minions" select employers.
There are sane investors that prefer investing in companies that adopt these corporate structures. Based on data, those investors see public benefit corporations as more profitable and resilient. They are able to attract employees and customers that would otherwise not be interested or might be less interested.
What is "the agency problem"?
In modern management compensation theory (https://saylordotorg.github.io/text_introduction-to-economic... ) this is key to why executive compensation has increased much faster than workers in the last 50 years.
Stock-based compensation mixes evolved from this thesis; it is quite common in the Valley, and it is why almost all OpenAI staff wanted Sam Altman back even though the non-profit board did not.
Aligning key talent's compensation to enterprise value is only viable in unrestricted for-profit entities; any structure with limits (capped profit, public benefit corporation, non-profit, trust, 501(c)s, etc.) does not work as well.
Talent will then leave for a for-profit entity that can offer better compensation than a restricted entity can, because for-profits share a % of their enterprise value, which restricted ones either cannot do or cannot offer with the same liquidity/value [1].
---
[1] This is why public companies are more valuable for RSUs/options than private companies, and why cash-flow-positive companies like Stripe still raise private money just to give liquidity to employees.
It's not the best choice, it's spacer's choice!
Semi-relevant sidenote: ChatGPT spent $8M on a Super Bowl commercial yesterday just to show cool visualizations, rather than any emotional product use case, to an audience the vast majority of which has never had direct experience with the product.
These companies would be best served building a marketing arm away from the main campus in a place like LA or NY to separate the gen pop story from that of the technology.
I think AI in its current iteration is going to settle into being like a slightly worse version of Wikipedia morphed with a slightly better version of stackoverflow.
At the base of LLM reasoning and knowledge is a whole corpus of reasoning and knowledge. I am not quite convinced that LLMs will breach the confines of that corpus and the logical implications of the data there. No "eureka" discoveries, just applying what we already have lying around.
Just to try it out, I uploaded the paper to DeepSeek-R1 and wrote a paragraph on the desired algorithm, that it should code it in Python and that the code should be as simple as possible while still working in exactly the way as described in the paper. About ten minutes later (quite a long reasoning time, but inspecting the chain of thought, it did almost no overthinking, but only reasoned about ideas I had or should have considered) it generated a perfect implementation that worked for every single test case. I uploaded my own attempt, and it correctly found two errors in my code that were actually attributable to naming inconsistencies in the original paper that the model was able to spot and fix on the fly. (The model did not output this, this I had to figure out myself.) I would have never expected AI to do that in my lifetime just two years ago.
I don't know whether that counts as "novel" to you, but before DeepSeek, I also thought that Copilot-like AI would not be able to really disrupt programming. This one experience completely changed my view. It might be the case that the model was trained on similar examples, but I find it unlikely simply because the concrete algorithm cannot be found online except in the paper.
Combined with the old “nothing new under the Sun” maxim, in that most ideas are re-hashes or new combinations of existing ideas, and you’ve got a changed landscape.
But this is not the majority of what software developers are doing and working on today. Most have a set of features or goals to implement using code satisfying certain constraints, which is what current reasoning AI models seem to be able to do very well. Of course, this test was not rigorous in any meaningful way, but it really changed my mind on the pace of this technology.
Plenty of value is already added just by converting unstructured data to structured data. If that were all LLMs did, they would still be a revolution in programming and human development. So much manual entry and development work has essentially evaporated overnight.
If there was never a chat based LLM "agent" LLMs just converting arbitrary text to structured JSON schema would be the biggest advancement in comp sci since the internet. There is nothing equivalent that existed before except for manual extraction or rule based hard coding.
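As a hedged illustration of that claim, here is a minimal sketch of the text-to-JSON pattern; call_llm is a hypothetical stand-in for whatever chat-completion client you use, and the schema is an invented example, not any vendor's API:

    import json

    PROMPT = (
        "Extract these fields from the text below and reply with JSON only: "
        '{"vendor": string, "date": string, "amount": number}\n\nText:\n'
    )

    def extract_fields(text: str, call_llm) -> dict:
        # call_llm(prompt) -> str is assumed; swap in your client of choice.
        raw = call_llm(PROMPT + text)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as e:
            # Real pipelines validate against a schema and retry on failure.
            raise ValueError(f"Model did not return valid JSON: {raw!r}") from e

Before LLMs, getting this flexibility meant hand-written rules or manual extraction, which is exactly the point above.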
Judging LLMs based on some criteria of creativity or intuition from a chat is missing the forest for the trees.
Well over 90% of work out there is not novel. It just needs someone to do it.
And if the flywheel is that AI begets AI exponentially in an infinite loop then those share certificates you own probably won't be worth much. The AI won.
Coincidentally, Anthropic's mission is AI safety.
That said, this doesn't seem like completely superfluous "fat" like what Mozilla does. It seems very much targeted at generating interesting bits of content marketing and headlines, which should contribute to increasing Anthropic's household brand-name recognition vs. other players like OpenAI, as well as making them seem like a serious, trustworthy institution rather than a rapacious startup that has no interest in playing nice with the rest of society. That is: it's a good marketing tool.
My guess is that they developed it internally for market research, and realized that the results would make them look good if published. Expect it to be "sunset" if another AI winter approaches.
I don't read it as "fear AI"; I read it as "change is happening because of AI".
I suspect that a) will get better over time. I also suspect that b) can be addressed by a pre-programmed prompt-flow that uses an LLM to gather requirements from a PM and ask probing questions to get a well-defined scope and agree on how edge cases should be handled, as sketched below. It doesn't seem far-fetched that an AI would also be able to call out small requirement changes that might allow for much simpler/faster solutions.
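A minimal sketch of what such a prompt-flow could look like (everything here, from call_llm to the prompts and the stop condition, is an assumption for illustration, not a description of any real product):

    SYSTEM = ("You are gathering software requirements. Ask one probing "
              "question at a time about scope and edge cases. When the scope "
              "is unambiguous, reply with exactly: DONE.")

    def gather_requirements(call_llm, max_turns=10):
        # call_llm(instructions, transcript) -> str is a hypothetical helper.
        transcript = []
        for _ in range(max_turns):
            question = call_llm(SYSTEM, transcript)
            if question.strip() == "DONE":
                break
            answer = input(f"{question}\n> ")  # the PM answers here
            transcript.append((question, answer))
        return call_llm("Summarize the agreed scope as a short spec.", transcript)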
At no point do I see an actual elevator pitch/tl;dr/summary of what the frak this index actually is, except that it’s part of some effort to track AI adoption. It just rains down figures about which industries are using how much AI without first grounding the new concept they’re introducing.
When you say you have a new economic index, you need to give me a number, how I should interpret that number, and where it comes from. I don’t see that.
GDP: a measure of a country's total economic output, computed by adding up end-product purchases.
CPI: the general price level, computed as a weighted average of prices throughout the economy.
Big Mac index: how expensive goods are in a country relative to the US, by reference to the local cost of a Big Mac converted through the exchange rate.
Here I expect something like “the economic output-weighted fraction of production taken over by AI”, but instead it’s just a list of AI adoption by industry.
Why introduce an index and not headline with a definition of an index? Which audience prefers that?
One thing I hope they'll correct going forward is the inclusion of API usage. Anecdotally, I only use Anthropic models via Cursor, so none of that usage shows up here. I'd expect specialized tools/interfaces like Cursor to grow, shifting more usage to the API. It would be a shame to miss out on that in the data set.
Even if they don’t train on the data they could break it down by user agent / API client ID and infer something about cursor traffic.
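For instance, a minimal sketch of that kind of metadata-only breakdown (the log format here is made up for illustration; nothing implies Anthropic actually logs requests this way):

    from collections import Counter

    def traffic_by_client(request_logs):
        # request_logs: iterable of dicts like {"user_agent": "Cursor/0.45", ...}
        return Counter(log.get("user_agent", "unknown") for log in request_logs)

    logs = [{"user_agent": "Cursor/0.45"},
            {"user_agent": "python-requests/2.31"},
            {"user_agent": "Cursor/0.45"}]
    print(traffic_by_client(logs))
    # Counter({'Cursor/0.45': 2, 'python-requests/2.31': 1})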
If they would just look at their product, they'd see that it literally says it in the model description that Opus is better for writing. If you advertise one of your models as geared for task X, the insight that people use it more for task X isn't really an insight.
It got to the point where I was forced to go to ChatGPT if I wanted to just be left alone and get my answers. Then o1, o1 pro, o3-mini and Deep Research dropped and I have almost no reason to go back to Claude anymore. These days my main use case is using it as part of Cursor for code generation / co-piloting. But that's it.
If Anthropic wants to get me back, they should treat me as an adult again.
At day job - finance/office stuff - essentially zero traction despite everyone having enterprise AI subs & brainstorming sessions about use cases etc.
Then go home & do some hobby coding and suddenly it's next level useful.
It's not that one is harder than the other, but rather that many jobs don't have an equivalent to a code base. The AI could, I think, grok parts of the job, but typing up the relevant context and what is required would take much longer than doing the task. There is nothing to copy & paste for a quick win in the same way as code.
I'd like to see a comparison to the data 6 months ago, before Sonnet 3.5. I suspect the automation rate will track up over time, but that may mostly be captured by API usage which isn't in the dataset.
https://openrouter.ai/rankings
It seems like the most popular choice for API access.
Unless they are trying to mislead competitors (who don't look at their own numbers...), they have no reason at all to game those numbers there.
But, we found out that OpenAI is/was gaming benchmarks (https://news.ycombinator.com/item?id=42761648) and that seems to be forgotten history now - so I don’t know.
But on the other hand, how would we find out that they've gamed the numbers, if they were gamed? Unless you work at Anthropic and have abnormally high ethics/morals, or otherwise private insight into their business, it sounds like we wouldn't be able to find out regardless.
On page 7 of the paper there's the diagram "Minimum fraction of tasks in use". On the left side, about 75% of occupations use at least one task, and on the right side the maximum is some occupation that uses slightly more than 95% of its tasks.
Cool.
Here I start to wonder how they got that graph.
At the start of section 3, "Methods and analysis", on page 4, it says:
> To understand how AI systems are being used for different economic tasks, we leverage Clio [Tamkin et al., 2024], an analysis tool that uses Claude [Anthropic, 2024] to provide aggregated insights from millions of human-model conversations. We use Clio to classify conversations across occupational tasks, skills, and interaction patterns, revealing breakdowns across these different categories. All analyses draw from conversation data collected during December 2024 and January 2025.
So this means they use real people's chats to make these estimations. I don't know Clio, but perhaps they did this: they sample chats from individuals, and some individuals never chatted while some individuals delegated all their work to Claude. But I wonder how they estimated the total number of tasks for an individual.
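One plausible reading of the "fraction of tasks in use" metric, and this is purely my guess at the method, not something the paper confirms: classify each conversation to an occupational task, then for each occupation count the share of its tasks seen at least once:

    from collections import defaultdict

    def fraction_of_tasks_in_use(conversations, classify_task, occupation_tasks):
        # classify_task(conv) -> task id (e.g., via a Clio-style classifier);
        # occupation_tasks: occupation -> set of its task ids (e.g., from O*NET).
        seen = defaultdict(set)
        for conv in conversations:
            task = classify_task(conv)
            for occ, tasks in occupation_tasks.items():
                if task in tasks:
                    seen[occ].add(task)
        return {occ: len(seen[occ]) / len(tasks)
                for occ, tasks in occupation_tasks.items()}

Under this reading, no per-individual task count is needed; the denominator comes from the occupational taxonomy, not from users.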
I am sure these answers are found by really going deep and reading the cited sources and running some experiments yourself, but I can't be bothered, sorry.
Again, I really wonder: how much of the total population uses AI? How much? How do different parts of the population differ? Can this be found out at all?
That's nice. My main prompt has a hint for when I refer to work, so Claude can assume my tech stack. Of course I exclude or mask confidential data, but it's still nice this stuff gets filtered.
Instead, this is a super rare and valuable look into who/what/how folks are doing with Claude across millions of conversations, nicely categorized by function and task.
The economic impact data (i.e. wages) that they might overlay onto that usage data is a separate thing that -- of course -- is more subjective and likely to be part of some PR machinery about the public value of AI etc.
But as to sharing the raw usage data itself - we should applaud it! What a useful window into how this stuff is being used in the real-world.
Will OpenAI release similar data? Why or why not? I hope they will. It elevates the discussion for everyone, and frankly would be 'good business' if it gets people thinking about who/how AI could be used at their organization with more granularity.
https://www.anthropic.com/legal/privacy
They use personal data "to improve the Services and conduct research." Your chat interactions (that is, your "Inputs and Outputs") are included in the data they collect. And: "If you include personal data in your Inputs, we will collect that information and this information may be reproduced in your Outputs."
You don't need to look for loopholes, it's spelled out plainly.
This is Anthropic we're talking about, they're rightly recognized as the 'ethical' AI company.
Putting it on your company blog is marketing, always.
The only exception is PornHub insights, because they don't need to advertise and the people reading are there for the insights.
Instead of getting robots that do the laundry and clean the kitchen we got robots that do token work in a showroom at a BMW factory.
All the knowledge surfaced through LLMs was already mostly available online, they just make it more cohesive. It is better search.
Devs have figured out that creating a login page over and over is not a job, and that is now somewhat automatable.
Also everyone hates the name Devin now.
There are things we say or write openly without caring who hears or reads it. Things we share with friends and family. Things we share with our closest friend, partner, or therapist. Finally there is our private heart which holds the things we’re not comfortable sharing with anyone.
I worry that LLMs are sufficiently anthropomorphic but not "real" enough to be privy to these latter thoughts. In the wrong hands this data is catastrophic at the individual level.
We have GitHub Copilot and Augment available for making suggestions inside VS Code. I don't think either is Anthropic, but I'm sure they offer a similar feature. I wonder if they count EVERY suggestion offered as a "use". Sometimes it really helps, but it makes plenty of suggestions I ignore. Does it essentially treat every keystroke as a use, then, since it updates / re-suggests with every keystroke?
Probably an overall smart move, since claiming to be doing economics sometimes leads to being positioned to make policy favorable to oneself.
HN and programming subreddits rave about Claude for coding, so it’s possible that a lot of developers use Claude for coding, but the average AI use case may weight differently on ChatGPT or Grok.
In my experience, if ChatGPT can’t solve a coding problem, I try again on Claude. Although this happens less frequently since upgrading to o1-pro and o3-mini-high. And I haven’t used Claude for anything else.
- but then they later mention they didn't include API queries in the data, only Free and Pro queries on the website. Most "full automation" type queries would use the API, not the web interface (and nowadays probably wouldn't use Claude anyway, given how expensive its API is compared to DeepSeek R1 or o3-mini).
Are there any other good reads on the economic impact of AI that are not just hype or marketing but more considered analyses of data / indicators?
I work both as a software developer and a psychologist, and I love tinkering in the shop with welding and mechanics. It is extremely obvious that using AI is more available and appropriate when coding, as you're often in front of a very capable computer with a good interface to interact with. When I am a psychologist, it's not as fitting to bring out a computer and input prompts. And when I'm working in the shop, it's more of a hassle to grab the phone and ask a question.
Certain types of work, knowledge work obviously, are ripe for integration with AI tools, but I think pure ease of use/availability is a major factor. Sometimes two seconds of extra effort is the difference between not doing something and doing it.
I'm a heavy user of dictation and voice-assisted features on mobile phones, but it just doesn't cut it when you have to fight with the phone to select text and copy-paste. (The tapping of selected text to copy is so temperamental, and why the hell is the contextual menu still so inconsistent after you've selected text! I select the text and wait for the tooltip to appear, but it only does so if it feels like it.)
Anyways, "ease of use for a given profession" vs "Actual usage" is also important, is my point... [Edit for spelling]
It's also often not useful because there is so much that dictation is not good at. For instance, if I want to ask "What does the ICD-10 code F320 stand for?", it might transcribe it as "What does IceDen code for F3. 120. stand for?" When I have to start messing around with the keyboard anyway, it's doubly slow compared to just typing on a physical keyboard.
Many times when I need input, the thing in question is a technical term. This is as true in psychology as in coding. So it must have a way to correctly understand uncommon terms, for instance a predictable way to spell them out or ask for clarification. Same with coding terms: what is the chance that it correctly understands "Explain #include <stdio.h> syntax"?
That said, it's awesome as long as the question uses common and predictable words. It's just surprising how often it uses uncommon terms. Thus, it's awesome, but limited. The best use case is when I think of a topic while walking the dog that I want more information on. Then I can have a cool conversation with it while walking.
On another note: it went completely off the rails for me a month ago and stopped giving useful information after it created a memory that I "want short, concise, factual, and to-the-point responses," which is true, but it went from informative to almost giving me the silent treatment, answering so tersely that it was useless. I feel it never got completely back to normal after I removed that memory.
"A Major Law Firm's ChatGPT Fail" https://davidlat.substack.com/p/morgan-and-morgan-order-to-s...
"Lawyer cites six cases made up by ChatGPT" https://arstechnica.com/tech-policy/2023/05/lawyer-cited-6-f...
"AI 'hallucinations' by ChatGPT end up costing B.C. lawyer" https://www.msn.com/en-ca/news/world/ai-hallucinations-creat...
The list goes on and on. Maybe there's a bespoke RAG solution that works...maybe.
In what year would you think it will be acceptable and why?
LLMs are tools, I don't see anything wrong with using them in any occupation as long as the user is aware of the limitations.
"Claude is fully capable of acting as a Supreme Court Justice right now."
I see only a few images available to download.
This may just be my ignorance, but it seems that distributed version control is a highly valuable technology which hasn’t penetrated that well into law. If this is true—my evidence is only anecdotal, talking with lawyers—then it should provide partial insight that translates into the problem of LLM adoption.
Before, it was "smart contracts will replace lawyers & contracts" and "DeFi will replace traditional finance".
Now it's "AI will replace jobs", because it can autocomplete JavaScript and guess the next sequence of English / {{whatever}}-language words.
Hell, AI won't even replace CRUD software engineers who make software based on some business rules.
Here's the reality: You are getting displaced.
Companies like Anthropic and OpenAI screaming about AGI are repeatedly lying to you as they raise more money, while Meta (who are laying off staff today), Salesforce (announced layoffs as well) [0], Klarna (not hiring), etc. are admitting this in front of us (and laughing at all of us).
Do you get it now? I'm giving you a 5 year head start of their plan before it becomes a complete catastrophe for the market. [1]
[0] https://news.ycombinator.com/item?id=42975813
[1] https://www.weforum.org/publications/the-future-of-jobs-repo...
The company seems to be operating in a classic failure mode: being more concerned with its industry than its competitors and customers.
See the first few points here: https://brief.bismarckanalysis.com/p/27-insights-from-three-...
Where I could be wrong: the CEO is technical; however, most of what I hear from them is about industry and social impact instead of product.
Have you considered that, since they are a public benefit corporation staffed with people who left OpenAI in part due to more capitalistic pursuits, this is by design?