𝔊𝔴𝔢𝔯𝔫 @gwern · May 9
(Every year I tweet mocking China AI hawks being wrong yet again, I get a frisson of fear: "what if today is finally the day that they drop something as epochal as Vaswani or Brown or...?"
But then, a bird missed pooping on my head by a few feet yesterday.
To live is to risk.)
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9
"Western researchers can't compete with their access to databases; they can't even read the important research papers all written in Mandarin or using services inaccessible past GFW. This shows the inherent advantage of authoritarian technocracy over decadent liberal democracy!"
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9
It's funny to imagine the alternate history, where Dario Amodei still works for Baidu as a minor PM in the ERNIE division, where American AI is in shambles "because FANG is too short-sighted to invest, privacy rights too strong, and Western Internet too hopelessly fragmented."
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9
There was a year where the Chinese DL giants had developed rapidly for years before, and citation rates were surpassing American, and it looked like they were going to accelerate into DL scaling hyperspace, leaving the West behind & handing it all to Xi.
That year was... 2018.
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9
Or how about in 2017, when Baidu researchers published a little thing you might have heard of since then, called "scaling laws"? arxiv.org/abs/1712.00409…
𝔊𝔴𝔢𝔯𝔫 @gwern · May 9
BTW, one of the funniest things about my challenge here is that it gets easier the further back you go. Like FlashAttention's careful GPU memory hierarchy use allowing much greater NN scaling & speed... you know, like Baidu's 2016 Persistent RNNs (proceedings.mlr.press/v48/diamos16.p…).
𝔊𝔴𝔢𝔯𝔫 @gwern · May 8
(This overhang seems to have been mostly eaten up by the rash of music DL startups like Suno and Udio.
They don't seem to have any secret weapon like copyright or some brilliant new arch, so I guess it was just an ordinary 'automation as colonization wave' delay after all?)