James Sheen

en.wikipedia.org

Flow (psychology) - Wikipedia

+

“‘It was like floating,’ ‘I was carried on by the flow.’”

… suddenly I realised that I was no longer driving the car consciously. I was driving it by a kind of instinct, only I was in a different dimension. It was like I was in a tunnel.

This study further emphasized that flow is a state of effortless attention. In spite of the effortless attention and overall relaxation of the body, the performance of the pianist during the flow state improved.

…flow is associated with achievement…

−

…enjoyable activities that produce flow have a potentially negative effect: while they are capable of improving the quality of existence by creating order in the mind, they can become addictive, at which point the self becomes captive of a certain kind of order, and is then unwilling to cope with the ambiguities of life.

gatesnotes.com

AI is about to completely change how you use computers

In 5 years, agents will be able to give health care advice, tutor students, do your shopping, help workers be far more productive, and much more

Bill Gates

papers.ssrn.com

Sole Survivors: Solo Ventures Versus Founding Teams

A widespread scholarly and popular consensus suggests that new ventures perform better when launched by teams, rather than individuals. This view has become so

arxiv.org

Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models

We present Step-Back Prompting, a simple prompting technique that enables LLMs to do abstractions to derive high-level concepts and first principles from instances containing specific details. Using the concepts and principles to guide the reasoning steps, LLMs significantly improve their abilities in following a correct reasoning path towards the solution. We conduct experiments of Step-Back Prompting with PaLM-2L models and observe substantial performance gains on a wide range of challenging reasoning-intensive tasks including STEM, Knowledge QA, and Multi-Hop Reasoning. For instance, Step-Back Prompting improves PaLM-2L performance on MMLU Physics and Chemistry by 7% and 11%, TimeQA by 27%, and MuSiQue by 7%.

“The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.”
— Edsger W. Dijkstra
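
The two stages described in the abstract (first abstract, then reason with the abstraction) can be sketched as below. This is only an illustration under assumptions: `llm` is a hypothetical callable that maps a prompt string to a completion string, `fake_llm` is a toy stand-in so the sketch runs without any model, and the prompt wording is invented here rather than taken from the paper.

```python
from typing import Callable


def step_back_answer(question: str, llm: Callable[[str], str]) -> str:
    """Two-stage prompting: ask for the underlying principle, then answer with it."""
    # Stage 1 (abstraction): ask for the high-level concept behind the question.
    # The wording of this prompt is an assumption, not the paper's template.
    abstraction_prompt = (
        "What general concept or first principle is needed to answer the "
        f"following question? Do not answer it yet.\n\nQuestion: {question}"
    )
    principle = llm(abstraction_prompt)

    # Stage 2 (reasoning): answer the original question, guided by the principle.
    reasoning_prompt = (
        f"Principle: {principle}\n\n"
        f"Using the principle above, answer step by step.\n\nQuestion: {question}"
    )
    return llm(reasoning_prompt)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model, used only to make the sketch runnable."""
    if prompt.startswith("Principle:"):
        return "final answer (guided by principle)"
    return "Ideal gas law: PV = nRT"


answer = step_back_answer(
    "What happens to the pressure of a gas if its temperature doubles?", fake_llm
)
```

Passing the model in as a plain callable keeps the sketch independent of any particular API client; a real client call could be wrapped in a function with the same signature.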

codertoentrepreneurs.substack.com

Meet a Programmer Who Turned an Open Source Tool Into a $7.5 billion Empire

Crazy Journey to Inspire Young Programmers

Sanjay Priyadarshi

twitter.com

Lilian Weng on Twitter

“Agent = LLM + memory + planning skills + tool use This is probably just a start of a new era :) https://t.co/Qtp6cHpz2Q”

This is probably just a start of a new era!

medium.com

Meta’s New LLaMa AI Model is a Gift to the World

Spearheading Change

Ignacio de Gregorio

“Open-source will eventually win the AI race.” — Yann LeCun

uxdesign.cc

Why do creative people need time to sit around and do nothing?

The dilemma of constant activity for creative minds

Elvis Hsiao

The power of downtime.

johnfgorman.medium.com

The 3 Most Important Truths in Life

Here’s what matters.

John Gorman

Truth #1: Your Health Will Fail You
Truth #2: You Will Run Out of Time
Truth #3: Everyone You Will Meet Will Leave You
Health, time, and people are the only three things in this world that cannot be retrieved once they’re gone.

arxiv.org

LongNet: Scaling Transformers to 1,000,000,000 Tokens

Scaling sequence length has become a critical demand in the era of large language models. However, existing methods struggle with either computational complexity or model expressivity, rendering the maximum sequence length restricted. To address this issue, we introduce LongNet, a Transformer variant that can scale sequence length to more than 1 billion tokens, without sacrificing the performance on shorter sequences. Specifically, we propose dilated attention, which expands the attentive field exponentially as the distance grows. LongNet has significant advantages: 1) it has a linear computation complexity and a logarithmic dependency between any two tokens in a sequence; 2) it can serve as a distributed trainer for extremely long sequences; 3) its dilated attention is a drop-in replacement for standard attention, which can be seamlessly integrated with the existing Transformer-based optimization. Experimental results demonstrate that LongNet yields strong performance on both long-sequence modeling and general language tasks. Our work opens up new possibilities for modeling very long sequences, e.g., treating a whole corpus or even the entire Internet as a sequence.

arxiv.org

On the Expressivity Role of LayerNorm in Transformers' Attention

Layer Normalization (LayerNorm) is an inherent component in all Transformer-based models. In this paper, we show that LayerNorm is crucial to the expressivity of the multi-head attention layer that follows it. This is in contrast to the common belief that LayerNorm's only role is to normalize the activations during the forward pass, and their gradients during the backward pass. We consider a geometric interpretation of LayerNorm and show that it consists of two components: (a) projection of the input vectors to a $(d-1)$-dimensional space that is orthogonal to the $\left[1,1,...,1\right]$ vector, and (b) scaling of all vectors to the same norm of $\sqrt{d}$. We show that each of these components is important for the attention layer that follows it in Transformers: (a) projection allows the attention mechanism to create an attention query that attends to all keys equally, offloading the need to learn this operation by the attention; and (b) scaling allows each key to potentially receive the highest attention, and prevents keys from being "un-select-able". We show empirically that Transformers do indeed benefit from these properties of LayerNorm in general language modeling and even in computing simple functions such as "majority". Our code is available at https://github.com/tech-srl/layer_norm_expressivity_role .
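
The geometric reading in the abstract (project away the component along the all-ones vector, then rescale to norm $\sqrt{d}$) can be checked numerically. The sketch below is not the authors' code; it assumes a plain LayerNorm with no learned gain or bias and with epsilon set to 0, and the function names are made up for illustration.

```python
import math


def layer_norm(x):
    """Plain LayerNorm: subtract the mean, divide by the standard deviation.

    Assumption: no learned gain/bias and epsilon = 0.
    """
    d = len(x)
    mean = sum(x) / d
    var = sum((v - mean) ** 2 for v in x) / d
    return [(v - mean) / math.sqrt(var) for v in x]


def project_then_scale(x):
    """(a) Project onto the hyperplane orthogonal to [1, ..., 1]; (b) rescale to norm sqrt(d)."""
    d = len(x)
    mean = sum(x) / d
    centered = [v - mean for v in x]  # removes the component along [1, ..., 1]
    norm = math.sqrt(sum(v * v for v in centered))
    return [math.sqrt(d) * v / norm for v in centered]


x = [2.0, -1.0, 0.5, 4.0]
a = layer_norm(x)
b = project_then_scale(x)

# Both routes agree up to floating-point error.
same = all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(a, b))
# The result is orthogonal to the all-ones vector and has norm sqrt(d).
dot_with_ones = sum(a)
out_norm = math.sqrt(sum(v * v for v in a))
```

The two functions agree because the standard deviation equals the norm of the centered vector divided by $\sqrt{d}$, so dividing by it is the same as normalizing to unit length and multiplying by $\sqrt{d}$.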

oneusefulthing.org

What AI can do with a toolbox... Getting started with Code Interpreter

Democratizing data analysis with AI

Ethan Mollick

ai.facebook.com

The first AI model based on Yann LeCun’s vision for more human-like AI

I-JEPA learns by creating an internal model of the outside world, which compares abstract representations of images (rather than comparing the pixels themselves).

openai.com

Function calling and other API updates

We’re announcing updates including more steerable API models, function calling capabilities, longer context, and lower prices.

text-embedding-ada-002: reducing the cost by 75% to $0.0001 per 1K tokens.
gpt-3.5-turbo: reducing the cost of gpt-3.5-turbo’s input tokens by 25%. Developers can now use this model for just $0.0015 per 1K input tokens and $0.002 per 1K output tokens, which equates to roughly 700 pages per dollar.
gpt-3.5-turbo-16k: will be priced at $0.003 per 1K input tokens and $0.004 per 1K output tokens.
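
The per-1K-token prices listed above translate into request costs with simple arithmetic. The following is a rough sketch: the prices are copied from the announcement text, while the function names and the zero output price for the embedding model (which is quoted with a single price) are assumptions made for illustration.

```python
# USD per 1K tokens, copied from the announcement quoted above.
# Assumption: the embedding model has no separate output price, so it is set to 0.
PRICES = {
    "gpt-3.5-turbo": {"input": 0.0015, "output": 0.002},
    "gpt-3.5-turbo-16k": {"input": 0.003, "output": 0.004},
    "text-embedding-ada-002": {"input": 0.0001, "output": 0.0},
}


def request_cost(model, input_tokens, output_tokens=0):
    """Estimated cost in USD for one request (hypothetical helper)."""
    price = PRICES[model]
    return (input_tokens / 1000) * price["input"] + (output_tokens / 1000) * price["output"]


def price_before_cut(new_price, cut_fraction):
    """Back out the earlier price implied by 'reduced by X% to Y'."""
    return new_price / (1 - cut_fraction)


# 1,000 input tokens plus 500 output tokens on gpt-3.5-turbo: about 0.0025 USD.
chat_cost = request_cost("gpt-3.5-turbo", 1000, 500)

# A 75% cut to $0.0001 implies an earlier embedding price of about $0.0004 per 1K tokens.
old_embedding_price = price_before_cut(0.0001, 0.75)
```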

thenewslens.com

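萬維鋼 (Wan Weigang), 《佛畏系統》 ("The Buddha Fears Systems"): Only those who reach the state of "grayscale cognition, black-and-white decisions" can be entrusted with important matters - The News Lens 關鍵評論網

For every kind of problem in life, you can build a system to deal with it and solve it systematically. Drawing on a wealth of everyday examples, this book helps you break through common blind spots in thinking and analyzes common problems in five major systems: "work", "learning", "getting things done", "relationships", and "society".

Selected book excerpt

"Ordinary people fear consequences, bodhisattvas fear causes, the Buddha fears systems"
"Risk", where the probabilities are known, versus "uncertainty", where the probabilities cannot be assessed
The "free prospector" and the "stable copper miner"
"The weak take revenge, the strong forgive, the wise ignore"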