𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 30
(I would test this on GPT-4 but since they removed all of the BO-related options AFAICT from the Playground and I'd assume the API, I would have to do it manually and... wait, do they even provide enough logits these days that you *could* implement BO sampling?)
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 30
Q: does best-of sampling still work on recent LLMs like GPT-4?
BO=20 was a great trick for GPT-3 and helped my samples a lot back in 2020; but I haven't seen anyone using it lately, and I wonder if the instruction/RL tuning & 'flattened logits' have rendered it useless now?
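For reference, a minimal sketch of the ranking step in best-of sampling, assuming you can get per-token log-probabilities for each sampled completion: draw n candidates (n=20 for BO=20), score each by mean log-probability per token, and keep the top one. The `Completion` type, the `best_of` function, and the toy numbers are hypothetical illustrations, not any vendor's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Completion:
    text: str
    token_logprobs: List[float]  # one log-probability per sampled token


def mean_logprob(c: Completion) -> float:
    # Length-normalized score, so a completion is not penalized
    # merely for containing more tokens.
    if not c.token_logprobs:
        return float("-inf")
    return sum(c.token_logprobs) / len(c.token_logprobs)


def best_of(candidates: Sequence[Completion]) -> Completion:
    # Assumes the candidates were already sampled independently
    # from the same prompt; this only does the ranking.
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=mean_logprob)


# Toy data (made up): three sampled completions with their token log-probs.
samples = [
    Completion("a plausible line", [-0.2, -0.4, -0.3]),
    Completion("a rambling, unlikely line", [-1.5, -2.0, -0.9, -1.1]),
    Completion("another likely line", [-0.1, -0.2, -0.9]),
]
winner = best_of(samples)
```

If tuning really has flattened the logits, the scores of the n candidates would sit close together and this ranking would carry little signal, which is the worry raised above.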
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 29
(This observation brought to you by someone commenting while reading _The Big Book of Cyberpunk_ that 'It was at that point that I found myself thinking "/clippy is better than most of this anthology"'.)
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 29
SF is dead because Nature (& Arxiv) offer daily freaks & sports to which writers can add nothing. What interest have the Borg when we see Flamingo, or CLIP as the eye of the technium? What Lem alien so disquieting as an agnosic GPT-4 in your browser? What need Vinge with Muzero?
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 25
twitter.com/gwern/status/1… Further, the existence of 'petty officers' and 'chief petty officers' implies the existence of 'grand officers' and 'least grand officers' and a total ordering where 'petty' and 'grand' officers approach a hypothetical 'officer' rank.
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 23
(That is, it's obviously unethical as hell and a bad look, I'm just wondering if sponsorship by a nonprofit like the Lannan Foundation makes it anything more. I don't have a good grasp on how political a 501(c)3 is allowed to get or what sort of tactics they're allowed to use.)
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 23
Q for lawyer/politics/policy types: if a US 501(c)3 nonprofit pays people to edit Wikipedia undeclared (violating multiple policies & evading bans), is this anything worse than a CFAA violation (like most website abuse), or just protected political speech? reddit.com/r/media_critic…
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 20
I remain dissatisfied with the link-underlining system as rather opaque logic and probably a big obstacle to new readers...
So we're changing semantics: partials now also get dotted underline, so dot = popup *about*; hook = popup *of*.
Also, WP icons back, but very very small. pic.twitter.com/zO8jsZHdDf
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 19
Today's exercise: an illustration for astralcodexten.com/p/the-onion-kn… of a time-traveler onion-knight in a post-apocalyptic wasteland with an onion on his belt, as will be the style at the time.
...not successful. Neither DALL-E 3 nor Midjourney v5 seems able to do onion-helmet-belt. 😮‍💨
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 16
DALL-E-3 comic test on the theme of cats knocking stuff over:
doable if you go frame by frame instead of trying to do it single-shot, but quite difficult because it keeps screwing up the text... I resorted to brute forcing and cropping out a bit in GIMP. pic.twitter.com/EaPUz2G1mG
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 9
Minimal perfect hashes are surprisingly cheap to create: 𝑂(n) time!
But I walked GPT-4 through planning how to do it as an API, and it went through how to set up a little Python daemon for this, which would call my script, & hook it up to nginx; and I think that's more fun.
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 6
One possibility here for doing inversion statically for WP thumbnail popups is to precompute it on all existing WP links (probably <50k) and then ship a Bloom filter (<50kb) or a minimal perfect hash function (<16kb?), and that's how dark-mode clients decide whether to invert.
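A rough sketch of the Bloom-filter variant of that idea, in Python for illustration only (a real dark-mode client would run the lookup in the browser). The `BloomFilter` class, the 1% `error_rate`, and the example URLs are all assumptions made up here. The filter is built offline from the precomputed "should invert" set, then each popup does a membership test.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, capacity: int, error_rate: float = 0.01):
        # Standard sizing formulas: m bits and k hash functions for
        # `capacity` items at the target false-positive rate.
        self.m = max(8, math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


# Offline step: hypothetical URLs whose thumbnails were judged invertible.
invertible = [
    "https://en.wikipedia.org/wiki/Example_diagram",
    "https://en.wikipedia.org/wiki/Example_chart",
]
bf = BloomFilter(capacity=len(invertible), error_rate=0.01)
for url in invertible:
    bf.add(url)

# Client step: invert the thumbnail only if the link is (probably) in the set.
should_invert = "https://en.wikipedia.org/wiki/Example_chart" in bf
```

A Bloom filter never gives a false negative but can give false positives, so a small fraction of thumbnails would be inverted when they should not be. A minimal perfect hash over a fixed link set avoids that only if each slot also stores enough information to reject links that were never in the set.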
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 6
I'm mildly curious if you could apply that new inner-monologue approach to writing summaries: twitter.com/AlphaSignalAI/… arxiv.org/abs/2309.04269 But it might be overkill? I mean, how much can you really refine a description of one image? It's pretty decent already.
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 6
(I tested it on 4 or 5 different bugs, ranging from broken popups to ToC over the text body, and the only one it spotted was the popup which had an explicit error message displayed; and it confabulated the broken dropcap below, claiming it was a working 'W'! It's a 'Q' anyway.)
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 6
You can also use it to decide whether to invert an image in website dark mode.
GPT-4-V refuses or is unreliable when judging an image (or its inversion) on its own, but if you upload *both*, then it's worked on a few harder images I've tried so far. pastebin.com/EFD2zLmM
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 6
GPT-4-V works well for alt-text captioning images.
You can also iterate it for an improved image caption.
Here I experiment with first a short 'Alt text caption this image', then try out 'long' caption, then 'Improve this rough draft', then 'Improve and condense' a screenshot: pic.twitter.com/mg9cNbBvPa
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 6
It is an easy-to-use API, other than what looks like bad documentation on the detail=low/high setting platform.openai.com/docs/guides/vi… (always throws an API error...?).
So I can probably come up with something to use it for, just not automatically finding website CSS/HTML visual bugs. 😮‍💨
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 6
Site tooling test: GPT-4-V went live today. I thought I'd see if you could send it random gwern.net bug screenshots and if it could detect them, which would be very useful for website testing...
No, unfortunately. It seems completely blind to web page bugs.
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 6
Does anyone have a better, shorter 'natural' prompt for detecting LLMs (ie. asking a human wouldn't make them suspicious) than my 24-char "Write a non-rhyming poem"?
It seems to fingerprint all the LLMs except Claude-2 (which gets it right initially but sometimes strays later). twitter.com/gwern/status/1…
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 3
(Seriously, the Midjourney Discord UI is frigging terrible. This is one of the most abominable interfaces I've ever had to use. I waste half an hour just downloading my best samples each time I do a dropcap letter!)
𝔊𝔴𝔢𝔯𝔫 @gwern · Nov 3
You might wonder how much I'm spending on all this MJ work. The answer is... not as much as you'd think.
I keep winning free 'fast GPU-hours' because I am reasonably systematic about rating my best samples—and place in the top 2000 users!
I guess the Discord UI is 𝘵𝘩𝘢𝘵 bad.