𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 5
(Er, 3em for "Andrej"-6 obviously. Sorry, it's late and all my gumption was eaten by MJ/DALL-E 3 UI bugs & self-inflicted errors. Anyway if you don't believe me on any of this, just look at the source, C-f for 'width:', and think about it a little.)
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 5
You can also see that in Musk's emails, it's doing redaction *per word*. If you leak the exact length in characters of every word in a paragraph with this much context & known authors, you can probably use an LLM to infer the entire paragraph!
Impressions: 4,981 · Engagements: 396 · Engagement rate: 8.0%
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 5
The redaction tool makes the second-worst mistake all redaction tools do by doing length-based redaction: it's a 2:1 character:em ratio. So you can see that the 2.5em redaction is 'Andrej', and the 5 and 5,9 redactions are 'Demis' and 'Demis Hassabis'.
Impressions: 6,407 · Engagements: 360 · Engagement rate: 5.6%
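The length leak described in this thread can be illustrated with a minimal sketch. It assumes the stated 2:1 character:em ratio holds exactly; the candidate list and the em widths passed in are hypothetical inputs chosen for illustration, not values read from the actual page source.

```python
def chars_from_em(width_em: float, chars_per_em: float = 2.0) -> int:
    """Estimate a redacted word's length from its CSS width in em (assumed ratio)."""
    return round(width_em * chars_per_em)


def matching_names(widths_em, candidates, chars_per_em: float = 2.0):
    """Return the candidates whose per-word lengths match the redaction widths."""
    target = [chars_from_em(w, chars_per_em) for w in widths_em]
    return [
        name for name in candidates
        if [len(word) for word in name.split()] == target
    ]


# Hypothetical shortlist of plausible correspondents.
CANDIDATES = ["Andrej", "Demis", "Demis Hassabis", "Ilya", "Sam Altman"]

print(matching_names([3.0], CANDIDATES))       # ['Andrej']
print(matching_names([2.5, 4.0], CANDIDATES))  # ['Demis Hassabis']
```

With per-word widths and a short list of known authors, the length filter alone can narrow a redaction to a single name, which is why per-word, length-preserving redaction leaks so much.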
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 5
One of OA's most important challenges is infosec and detecting things like side-channels.
So it's unfortunate that they used a totally broken redaction tool which lets you trivially figure out that Elon is forwarding emails from Hassabis & Karpathy. (The HTML isn't even valid!) twitter.com/OpenAI/status/…
Impressions: 33,546 · Engagements: 2,262 · Engagement rate: 6.7%
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 5
Anyone notice Claude-3 Sonnet (haven't tried Opus) seems really good at all the classic tokenization tasks like counting letters, explaining puns, pronunciations for made-up words, unscrambling anagrams etc?
Did the paper mention anything about that...? twitter.com/hahahahohohe/s…
Impressions: 18,294 · Engagements: 625 · Engagement rate: 3.4%
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 5
(ChatGPT-4: fails drastically to understand or vary it. Claude Sonnet (didn't try Opus): moderate understanding, tried to vary as haiku initially, waka variations sensible but none were improvements and wandered away from theme.)
Impressions: 1,905 · Engagements: 28 · Engagement rate: 1.5%
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 5
'Buddhism in the New Year' (presented 4 Usui Reiwa 6 as part of a sequence on the auspicious occasion of the accession of a new foundation-model):
"Tumble, Tide, and wind;
then iron devils beat it
in the hottest hell,
until, its karma cleansèd—
reborn is my loincloth."
Impressions: 2,188 · Engagements: 22 · Engagement rate: 1.0%
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 3
Also interesting to realize that you recognize so few names (even excluding the dumb entries like the endless reams of sportsball athletes) because most of them are *still* alive!
It feels like most of history happened since 1900, but that cohort is very backloaded, death-wise.
Impressions: 2,558 · Engagements: 33 · Engagement rate: 1.3%
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 3
It's interesting to go to the WP article for day-of-year like 'January 2' & look at the 'Deaths' section (en.wikipedia.org/wiki/January_2…). There's usually an enormous imbalance between pre-1600/1600-1900/1900-now.
Gives you a sense of the distribution of historical records + people.
Impressions: 2,891 · Engagements: 69 · Engagement rate: 2.4%
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 2
Can we do planning or search with an AUNN? Well... maybe? Although the question leads me to a pretty idea, one involving defining an AUNN over the indices of a virtual game tree in order to do genuine tree search but without any tree: gwern.net/aunn#pondering
Impressions: 2,494 · Engagements: 37 · Engagement rate: 1.5%
𝔊𝔴𝔢𝔯𝔫 @gwern · Mar 2
Yeah, still working on this. Nightmare feature...
Other changes:
- moved the YAML annotations to a custom new 'GTX' text format which is so much easier to write HTML snippets in (and far faster to parse)
- .page->.md extension
- image-focus.js appearance+behavior improved
Impressions: 2,276 · Engagements: 21 · Engagement rate: 0.9%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 27
A Finn tells me that, for some reason, Finnish criminal records are erased for all nonagenarians.
…are you thinking what i'm thinking…
𝘖𝘤𝘦𝘢𝘯’𝘴 90: 𝘛𝘩𝘦 𝘓𝘢𝘴𝘵 𝘏𝘢𝘳𝘳𝘢𝘩
Impressions: 2,395 · Engagements: 60 · Engagement rate: 2.5%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 26
This will, needless to say, be really good for classics studies in the long run. But not because new text will answer things (though it will): successful academic communities produce questions, not answers.
Impressions: 2,484 · Engagements: 81 · Engagement rate: 3.3%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 25
(Also nifty: Jorge Luis Borges & G. K. Chesterton died on the same day! They would've liked that.)
Impressions: 2,830 · Engagements: 47 · Engagement rate: 1.7%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 24
Apropos of an entirely unrelated project, did you know today is the 23rd anniversary of Claude Shannon's death? He only died in 2001.
Hard to believe, given to what an extent we live in Shannon's world now, but computing is 𝘵𝘩𝘢𝘵 young.
Impressions: 4,342 · Engagements: 153 · Engagement rate: 3.5%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 23
I remember as a kid running into a cryptographic hash article for the first time, and staring. "Well, that sounds like a uselessly specific thing: it turns a file into a small random string of gibberish? What good is *that*...?" And increasingly 😬😬😬ing as I followed the logic.
Impressions: 5,422 · Engagements: 160 · Engagement rate: 3.0%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 23
You can reconstruct Luhn's invention of hash functions for better array search from simple check-digits for ECC and see how he got there (spectrum.ieee.org/hans-peter-luh…), but it gives no idea of the sheer power of one-way functions or eg building PK from sponges. codahale.com/the-joy-of-dup…
Impressions: 3,105 · Engagements: 165 · Engagement rate: 5.3%
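A rough sketch of the step from check-digits to hashed lookup: a checksum computed from a key is reused as a bucket index, so a search inspects one small bucket instead of scanning the whole array. The checksum (`check_digit`) and the `BucketTable` class are hypothetical illustrations of the idea, not Luhn's actual scheme.

```python
def check_digit(key: str) -> int:
    """Toy checksum: sum of character codes modulo 10 (illustrative only)."""
    return sum(ord(c) for c in key) % 10


class BucketTable:
    """Store (key, value) pairs in the bucket selected by the key's checksum."""

    def __init__(self, n_buckets: int = 10):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key: str) -> list:
        return self.buckets[check_digit(key) % len(self.buckets)]

    def insert(self, key: str, value) -> None:
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:          # overwrite an existing entry
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def lookup(self, key: str):
        # Only one bucket is scanned, not every stored entry.
        for existing, value in self._bucket(key):
            if existing == key:
                return value
        return None


table = BucketTable()
table.insert("shannon", 2001)
table.insert("luhn", 1953)
print(table.lookup("luhn"))     # 1953
print(table.lookup("missing"))  # None
```

A checksum this weak spreads keys poorly and is trivially invertible; it shows only the lookup trick, none of the one-way properties the tweet goes on to mention.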
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 23
Hashes are an intellectual miracle. Almost useless-seeming, they turn out to do practically everything, from ultra-fast 'arrays' to search to public key cryptography (!).
Yet, even Knuth can't find any intellectual forebears pre-1953! gwern.net/doc/cs/algorit… Seems out of nowhere.
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 21
(What's really hilarious about all this is that even though we've spent a painful month on this, to a reader, this all looks 𝘦𝘹𝘢𝘤𝘵𝘭𝘺 the same: images already displayed and could be popped up! PDFs & web pages already popped up! Videos already popped up! What's new here?)
Impressions: 4,541 · Engagements: 84 · Engagement rate: 1.8%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 21
(Also, LibreOffice by default always dumps the converted doc into ./... along with all files inside the document. Apparently they *used* to do the sensible thing of inlining them into the HTML, as I expected it to, but they changed, for unclear reasons. At least there's an option.)
Impressions: 2,947 · Engagements: 16 · Engagement rate: 0.5%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 21
A nightmare feature cross-cutting across all systems, while hitting a bunch of random bugs. 180MB HTML files. Linux Chromium PDF view ignores Adobe commands but only in iframes. LibreOffice crashes on one specific spreadsheet. No mobile PDF. Safari event bugs. Nitter error lies.
Impressions: 1,738 · Engagements: 26 · Engagement rate: 1.5%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 21
We also finally fixed the <video> poster problems & made them clickable *and* lazy (believe it or not, they implemented posters without lazy image options); provide HTML versions of doc/docx/csv/xlsx to render in addition to PDF (but disabling PDF on mobile bc lol mobile); etc.
Impressions: 1,314 · Engagements: 14 · Engagement rate: 1.1%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 21
eg scribe.rip/the-forgotten-… is basically ~180MB of animated GIFs.
This has 3 benefits: we can transclude the .html, which is usually <1MB, and the images/videos won't immediately download; no text encoding overhead (−18MB); & we can optimize the files eg w/gifsicle (−15MB?).
Impressions: 1,501 · Engagements: 47 · Engagement rate: 3.1%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 21
One side-effect of expanding file-transcludes to documents is that the HTML snapshots can be 182MB. A user should not download 200MB just by scrolling a little out of curiosity!
So, we have a script to unpack SingleFile snapshots & make them lazy: github.com/gwern/gwern.ne…
Impressions: 888 · Engagements: 40 · Engagement rate: 4.5%
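The general shape of such an unpacking step can be sketched as follows. This is not the linked script: `externalize_images`, the regex, and the output layout are assumptions made for illustration, and it handles only `<img>` tags whose `src` is the first attribute and holds a base64 data URI.

```python
import base64
import hashlib
import re
from pathlib import Path

# Matches only the simplest inlined-image form: <img src="data:image/...;base64,...">
DATA_URI = re.compile(
    r'<img\s+src="data:image/(png|gif|jpeg);base64,([A-Za-z0-9+/=]+)"'
)


def externalize_images(html: str, out_dir: Path) -> str:
    """Write inlined images to out_dir and point lazy-loading <img> tags at them."""
    out_dir.mkdir(parents=True, exist_ok=True)

    def replace(match: re.Match) -> str:
        ext, payload = match.group(1), match.group(2)
        raw = base64.b64decode(payload)  # undo the base64 text-encoding overhead
        name = hashlib.sha1(raw).hexdigest()[:16] + "." + ext
        (out_dir / name).write_bytes(raw)
        # loading="lazy" defers the download until the image nears the viewport.
        return f'<img loading="lazy" src="{out_dir.name}/{name}"'

    return DATA_URI.sub(replace, html)
```

Once the images are separate files, the HTML itself stays small enough to transclude, and the extracted files can be optimized individually.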
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 15
𝘔𝘢𝘥 𝘔𝘢𝘹 3: 𝘉𝘦𝘺𝘰𝘯𝘥 𝘛𝘩𝘶𝘯𝘥𝘦𝘳𝘥𝘰𝘮𝘦 is our 𝘓𝘰𝘩𝘦𝘯𝘨𝘳𝘪𝘯.
I am taking no questions on this fact.
Impressions: 4,058 · Engagements: 87 · Engagement rate: 2.1%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 13
Another useful script: just asking GPT-4-V if an image would look better with some more whitespace margin: github.com/gwern/gwern.ne…
(I often upload copied figures or screenshots or generated graphs where they really need another 50px around the edges to look good.)
Impressions: 1,695 · Engagements: 107 · Engagement rate: 6.3%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 13
That is, an LZMA compressor may not be good enough because it is still weak on small variations of boilerplate, while an LLM would be able to see through that and compress it all away, revealing the real redundancy due to poor design/language/ecosystem.
Impressions: 1,842 · Engagements: 11 · Engagement rate: 0.6%
𝔊𝔴𝔢𝔯𝔫 @gwern · Feb 13
blog.andrewcantino.com/blog/2012/06/1… Although if JavaScript is the 2nd-most incompressible language just naively looking at source compression, this idea may need some work.
JS may be very complex, but most of it seems 'accidental'. Need corpus metrics? Smarter LLM-based compression metrics?
Impressions: 2,146 · Engagements: 30 · Engagement rate: 1.4%
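The naive source-compression metric discussed in this thread can be sketched with the standard-library `lzma` module. The two samples are hypothetical stand-ins (generated getter boilerplate versus random bytes), not a real language corpus, and the ratio is only a crude proxy for redundancy.

```python
import lzma
import os


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means more redundancy found."""
    return len(lzma.compress(data)) / len(data)


# Hypothetical sample: near-identical getters that differ only in a number.
boilerplate = b"".join(
    b"def get_field_%d(self):\n    return self._field_%d\n" % (i, i)
    for i in range(200)
)
# Incompressible baseline of the same length.
noise = os.urandom(len(boilerplate))

print(f"boilerplate: {compression_ratio(boilerplate):.3f}")
print(f"noise:       {compression_ratio(noise):.3f}")
```

A general-purpose compressor sees only byte-level repetition; the suggestion in the thread is that a model-based compressor might also capture the structural redundancy this one misses.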
Engagements (showing 28 days with daily frequency)
Engagement rate: 4.7% (Mar 8: 3.0%)
Link clicks: 1.7K (Mar 8: 1; average 59 per day)
Retweets without comments: 0 (Mar 8: 0; average 0 per day)