See Also
Gwern
“The Scaling Hypothesis”, Gwern 2020
Links
“Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine”, Zhang et al 2024
“RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture”, Balaguer et al 2024
“Inside the Chaos at OpenAI: Sam Altman’s Weekend of Shock and Drama Began a Year Ago, With the Release of ChatGPT”, Hao & Warzel 2023
“Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation”, Ding et al 2023
“Does GPT-4 Pass the Turing Test?”, Jones & Bergen 2023
“PAIR: Jailbreaking Black Box Large Language Models in 20 Queries”, Chao et al 2023
“Fine-Tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!”, Qi et al 2023
“Non-Determinism in GPT-4 Is Caused by Sparse MoE”, 152334H 2023
“Large Language Models As Superpositions of Cultural Perspectives”, Kovač et al 2023
“AI Is a Lot of Work: As the Technology Becomes Ubiquitous, a Vast Tasker Underclass Is Emerging—And Not Going Anywhere”, Dzieza 2023
“I’m Afraid I Can’t Do That: Predicting Prompt Refusal in Black-Box Generative Language Models”, Reuter & Schulze 2023
“Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4”, Chang et al 2023
“GPTs Are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models”, Eloundou et al 2023
“Why Didn’t DeepMind Build GPT-3?”, Godwin 2023
“OpenAI’s Sam Altman Talks ChatGPT And How Artificial General Intelligence Can ‘Break Capitalism’”, Konrad & Cai 2023
“GPT-3 As Knowledge Worker: A Zero-Shot Evaluation of AI CPA Capabilities”, Bommarito et al 2023
“Language Models Are Better Than Humans at Next-Token Prediction”, Shlegeris et al 2022
“HALIE: Evaluating Human-Language Model Interaction”, Lee et al 2022
“TruthfulQA: Measuring How Models Mimic Human Falsehoods”, Lin et al 2021
“‘How GPT-3 Is Shaping Our AI Future’ With Sam Altman/Azeem Azhar (The Exponential View), Wednesday 7 October 2020”
“Scaling Laws for Neural Language Models: Figure 15: Far beyond the Model Sizes We Study Empirically, We Find a Contradiction between Our Equations § Pg17”, Kaplan et al 2020
“Towards Synthesizing Complex Programs from Input-Output Examples”, Chen et al 2017
“Genetics of Caffeine Consumption and Responses to Caffeine”, Yang et al 2010
“Why GPT-3 Matters”, Gao 2020
“Greg Brockman: OpenAI and AGI”, Brockman 2019
M74108556
sharifshameem
Miscellaneous
- /doc/ai/nn/transformer/gpt/3/2019-11-07-amodei-aiandcompute-twodistincteras-gpt3modified.jpg
- https://andrewmayne.com/2023/11/14/is-the-reversal-curse-real/
- https://barryzhang.substack.com/p/our-humble-attempt-at-fine-tuning
- https://openai.com/blog/gpt-3-5-turbo-fine-tuning-and-api-updates
- https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4594466
- https://www.cerebras.net/blog/introducing-gigagpt-gpt-3-sized-models-in-565-lines-of-code
- https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-mode-collapse#pfHTedu4GKaWoxD5K
- https://www.reddit.com/r/mlscaling/comments/146rgq2/chatgpt_is_running_quantized/
Bibliography
- https://arxiv.org/abs/2401.08406#microsoft: “RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture”, Balaguer et al 2024
- https://www.theatlantic.com/technology/archive/2023/11/sam-altman-open-ai-chatgpt-chaos/676050/: “Inside the Chaos at OpenAI: Sam Altman’s Weekend of Shock and Drama Began a Year Ago, With the Release of ChatGPT”, Hao & Warzel 2023
- https://arxiv.org/abs/2310.08419: “PAIR: Jailbreaking Black Box Large Language Models in 20 Queries”, Chao et al 2023
- https://arxiv.org/abs/2310.03693: “Fine-Tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!”, Qi et al 2023
- https://152334h.github.io/blog/non-determinism-in-gpt-4/: “Non-Determinism in GPT-4 Is Caused by Sparse MoE”, 152334H 2023
- https://arxiv.org/abs/2307.07870: “Large Language Models As Superpositions of Cultural Perspectives”, Kovač et al 2023
- https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots: “AI Is a Lot of Work: As the Technology Becomes Ubiquitous, a Vast Tasker Underclass Is Emerging—And Not Going Anywhere”, Dzieza 2023
- https://arxiv.org/abs/2306.03423: “I’m Afraid I Can’t Do That: Predicting Prompt Refusal in Black-Box Generative Language Models”, Reuter & Schulze 2023
- https://arxiv.org/abs/2303.10130: “GPTs Are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models”, Eloundou et al 2023
- https://www.forbes.com/sites/alexkonrad/2023/02/03/exclusive-openai-sam-altman-chatgpt-agi-google-search/: “OpenAI’s Sam Altman Talks ChatGPT And How Artificial General Intelligence Can ‘Break Capitalism’”, Konrad & Cai 2023
- https://arxiv.org/abs/2301.04408: “GPT-3 As Knowledge Worker: A Zero-Shot Evaluation of AI CPA Capabilities”, Bommarito et al 2023
- https://arxiv.org/abs/2109.07958: “TruthfulQA: Measuring How Models Mimic Human Falsehoods”, Lin et al 2021