2. MiniMax-M1's tech report is worthwhile: https://github.com/MiniMax-AI/MiniMax-M1/blob/main/MiniMax_M... While they may not be the SOTA open-weights model, they do make some very big/notable claims about lightning attention and their GRPO variant (CISPO).
(I'm unaffiliated, just sharing what I've learned so far, since no comments have been made here yet.)
It would've been fun to see them name their models like Apple chips: M1, M1 Pro, M1 Ultra.
1. https://github.com/MiniMax-AI/MiniMax-M1/issues/2#issuecomme...
People have tested it. Q8 has essentially no drop in quality; Q4 is measurable but still not realistically a problem. If this impacts you, just pay for the commercial SaaS option.
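To give a feel for why Q8 loses so little relative to Q4, here's a toy round-trip experiment. This is a simplified symmetric per-tensor scheme, not the actual GGUF/llama.cpp quantization (which uses per-block scales), so treat it as an illustration of the bit-width effect only:

```python
import numpy as np

def quantize_roundtrip(w, bits):
    # Symmetric per-tensor quantization: scale weights into the signed
    # integer range, round, then scale back to floats.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(10000).astype(np.float32)
for bits in (8, 4):
    err = np.abs(w - quantize_roundtrip(w, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

Each bit removed roughly doubles the rounding error, which is why the 8-bit round trip is near-lossless while 4-bit is measurable.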
It's a fair point, but the conclusion is 'I don't know'.
I could assume that it gets better because it'll keep to simpler code.
Unless they want to train their own model, buying this for inference for $250k is unnecessary and still isn't enough for a full production deployment.
I like these people already.
* A Singapore-based company, according to LinkedIn. There doesn't seem to be much of a barrier to entry to building a very good LLM.
* Open weight models + the development of Strix Halo / Ryzen AI Max makes me optimistic that running great LLMs locally will be relatively cheap in a few years.
They're also planning to IPO on HKEX in Hong Kong soon.
https://www.scmp.com/tech/tech-trends/article/3314819/deepse...
If anyone has suggestions of people they respect who are thinking about this space, I'd love to hear more ideas and thoughts on these developments.
I'm interested in other opinions. I'm no expert on this stuff.
* double the bandwidth;
* half the compute; and
* double the price for comparable memory (128GB)
compared to the Strix Halo.
I'm more interested in the AMD chips because of cost. Plus, while I have an Apple laptop, I do most of my work on a Linux desktop, so a killer AMD chip works better for me. If you don't mind paying the Apple tax, a Mac is a viable option. I'm not sure about the software side of LLMs on Apple Silicon, but I can't imagine it's unusable.
An example of a desktop with the Strix Halo is the Framework Desktop (AI Max+ 395 is the marketing name for the Strix Halo chip with the most juice): https://frame.work/gb/en/products/desktop-diy-amd-aimax300/c...
There is nothing fundamentally new about freedom at the edges of society. Yes, it can lead to horrible situations, like someone killing their neighbors using the shiny new tool that's available to everyone. But that's far less of a concern than having the powerful new tool stay under the concentrated control of the greediest humans out there, who will gladly escalate any hindrance into genocide whenever something doesn't fit their perspective.
Nah, this is a Shanghai-based company.
If someone knows of a trustworthy article that states it outright, please feel free to share.
> RL at unmatched efficiency: trained with just $534,700
2. It's a legal requirement in some jurisdictions (e.g. https://www.gov.uk/running-a-limited-company/signs-stationer...)
3. It's useful for people who may be interested in applying for jobs
I can't say I remember any model/weights release including the nation where the authors happen to live or where the company is registered. Usually they include some details about which languages they trained on, and disclose some of their relationships, which you could use to infer that.
But is it really a convention to include the nation the company happens to be registered in, or where the authors live, in submitted papers? I think that would stick out more to me than a paper missing such a detail.
Where do you see that? e.g. I just checked https://openai.com/about/ and it doesn't say where they are based. I have no associations either way, but I usually have to work hard to find out where startups are based.
OpenAI, L.L.C.
1455 3rd Street
San Francisco, CA 94158
Attn: General Counsel / Copyright Agent
Is this what you are talking about?
2. This is a requirement for companies registered in the UK. You should also read your own link; it doesn't say anything about the company's presence on third-party websites.
3. This is such a remote reason it's laughable; there are plenty of things more relevant to potential job applicants, such as whether they are hiring at all.
You just want them to mention it because it's a Chinese company. If they were American, Mexican, German or Zimbabwean you wouldn't give the slightest fuck.
I don't know about the OP, but even as a layperson, I personally like to check where my things come from. And yes, I am mostly curious about which wide geopolitical region a thing is from.
In case of IT projects, it matters when I want to include them in a project.
Also, thanks for putting words in my mouth. If they were Mexican or Zimbabwean, I would find it very interesting to see a roughly SOTA model coming from that country.
They state their HQ as Singapore on LinkedIn, and San Francisco elsewhere. Given this, it's outright disingenuous that they don't mention they are a Chinese company.
As a layman, I'm mostly indifferent to this information.
If I were a project manager, this would be vital information. And the people running projects know this. So it raises the question: why not disclose it, and why obscure it?
They named themselves after a classic AI algorithm.
https://en.wikipedia.org/wiki/Claude_Shannon#Shannon's_compu...
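For anyone unfamiliar: minimax is the adversarial search rule from Shannon's chess paper. Each player picks the move that maximizes their score assuming the opponent replies optimally, i.e. minimizes it. A minimal sketch over a toy game tree (this is purely illustrative, nothing to do with the model itself):

```python
def minimax(node, maximizing):
    # A node is either a leaf score (number) or a list of child nodes.
    if isinstance(node, (int, float)):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# The maximizer picks the branch whose worst-case (minimizer) reply is best:
# branch [3, 5] -> opponent picks 3; branch [2, 9] -> opponent picks 2.
print(minimax([[3, 5], [2, 9]], True))  # prints 3
```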
https://www.bloomberg.com/news/articles/2025-06-18/alibaba-b...
Alright, so it's 87.5% linear attention + 12.5% full attention.
TBH I find the terminology around "linear attention" rather confusing.
"Softmax attention" is an information routing mechanism: when token `k` is being computed, it can receive information from tokens 1..k, but it has to be crammed through a channel of a fixed size.
"Linear attention", on the other hand, is just a 'register bank' of a fixed size available to each layer. It's not real attention, it's attention only in the sense it's compatible with layer-at-once computation.