Marina Cortês

Ticks of time: on Cosmology, Everest, and Ballet

Generative models (AI): a selection of key developments — 2023

(Last updated: November 23rd, 2023)

Table of Contents

Why did a theoretical physicist start to chase, and keep a close eye on, AI technology in 2023?

This is a short introduction to why I started this page. I cite many sources, but all mistakes and errors are my own responsibility.

I am part of a discussion collective of intellectuals brought together by Silicon Valley tech companies, jointly with companies in academic and scientific publishing. The discussion group is simply a town hall where members from very distinct academic and technological backgrounds can bring up and discuss any developments in technology, science, or industry.

This might include the discussion of key ongoing societal or economic transformations anywhere on the planet. Through this discussion group I had been aware of the ChatGPT technology since roughly the summer of 2022. However, I did not pay proper attention until December 2022, when we held our first video meeting on the subject.

By roughly mid April 2023, prompted in part by Jaron's New Yorker piece below, I had learnt enough about the ongoing events to become convinced that such developments merited devoted attention from academics. I decided to use my own expertise as a theoretical physicist, cosmologist, and data analyst to look into generative AI technology. I wanted to read the software, and the data being acquired and read, in order to assess for myself the capability and behaviour of the upcoming technologies. I guess this is how a physicist thinks: when confusion abounds, in theory or in data, we like to boil it down to “What do we actually know for a fact?”

Our priority in these discussion groups, in which this knowledge is circulated, has always been not to generate unwarranted, frenzied panic. However, it also quickly became clear that wide-audience education on the rapidly advancing AI technology would play a big role in a smooth transition to the new society we are about to witness, or are already witnessing.

Below is a step-by-step itemisation of the developments, ordered by date. The items contain the sources I found most useful as 2023 unfolded and the generative (large language) models were rolled out.

I have to thank a team of tireless, devoted, and engaged colleagues and friends, from all corners of society, who were vital in providing much of the content I was privileged to access in real time. All opinions and mistakes are my own.

January 20th, 2023

January 20th, 2023 — Google layoffs include many voices of open source software, including Chris DiBona (Director and Founder of Google's Open Source Programs Office, OSPO), Cat Allman (Google Open Source Program manager), and Mike Frumkin (Founder of Google's Accelerated Science team).

  • “Firing the best of the best”
    ‘Google has laid off many leading lights of the open source world. This will have a profound effect on software supply chain security.’

The layoffs happened suddenly, announced by email without any notice, and included some of the strongest, most brilliant individuals in the company. I took this news very seriously, as did many of us in these circles. Most unsettling was the realisation that some of the most talented minds (DiBona, Frumkin, Allman), who had worked inside big tech towards just and open technological societies for decades, could lose their positions as gatekeepers of software security and trustworthiness in the very groups they had founded.

Personally, this was the first alert that a new, significant, and disquieting change inside big tech might be about to happen, and was most likely already underway. Knowing personally the quality of the individuals in question gave me a good grasp of the magnitude of the power being called upon to carry through these layoffs and bring forth the new technologies.

January 20th, 2023 marks for me a turning point in realising that a change in trajectory of the planet was taking place. I remain hopeful that we will be able to reverse this trajectory, as societies and as a whole.

March 9th, 2023 – The AI Dilemma – Harris & Raskin

“The AI Dilemma”: Center for Humane Technology, March 9th, 2023, with Tristan Harris and Aza Raskin.
This video contained the most up-to-date source of information until August 6th, 2023 (scroll below for more info on “the Otter moment”). It is also, in my view, the most authoritative and well-prepared content. On March 9th, Harris and Raskin had already anticipated many of the developments that took place subsequently, at least five months ahead of their time.
Key moments:
1. 2023 is the year when all content-based identification will break down. We might be moving towards a future in which we only trust those with whom we interact personally, or in direct line of sight.
2. The technology is in accelerated development. We do not currently have a means for measuring the rate of acceleration.
3. Extractive technologies.

Here are the citations mentioned in The AI Dilemma:

  1. 2022 Expert Survey on Progress in AI:
  2. Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding:
  3. High-resolution image reconstruction with latent diffusion models from human brain activity:
  4. Semantic reconstruction of continuous language from non-invasive brain recordings:
  5. Sit Up Straight: Wi-Fi Signals Can Be Used to Detect Your Body Position:
  6. They thought loved ones were calling for help. It was an AI scam:
  7. Theory of Mind Emerges in Artificial Intelligence:
  8. Emergent Abilities of Large Language Models:
  9. Is GPT-3 all you need for low-data discovery in chemistry?
  10. LLMs can self improve—Article:
  11. Forecasting: AI solving competition-level mathematics with 80%+ accuracy:
  12. ChatGPT reaching 100M users compared with other major tech companies:
  13. Snap:
  14. Percent of large-scale AI results coming from academia:
  15. How Satya Nadella describes the pace at which the company is releasing AI:
  16. The Day After film:
  17. China’s view on chatbots:
  18. Facebook’s LLM leaks online:

March 29th, 2023

Emerging Tech Trend Report, 2023, Amy Webb at SXSW 2023, March 29th, 2023.
“Futurist Amy Webb, CEO of the Future Today Institute and professor at NYU Stern School of Business, provides a data-driven analysis for emerging tech trends, and shows perspective-changing scenarios for the future.”

April 20th, 2023 – Jaron Lanier at the New Yorker

“There is no AI”, Jaron Lanier (founding father of Virtual Reality), April 20th, 2023, New Yorker article.

April 26th, 2023 – NVIDIA’s Risk-based testing of AI systems

“Risk-based methodology for deriving scenarios for testing artificial intelligence systems” by Barnaby Simkin – NVIDIA – April 2023.
The best derivation of principles for AI regulation I have seen to date (September 5th, 2023), as far as a concrete strategy and clear outline are concerned.

My own opinion is that the interests of industry stakeholders need to be taken into account, as well as the interests of civil society and of policy makers.

I believe we need to bear everyone's interests in mind during regulation, otherwise cooperation will not succeed. We must not overlook the financial interests of those we are trying to regulate, or they will sidestep the rules and consign the whole regulatory enterprise to legal limbo.

May 5th, 2023 – Jaron Lanier, Unherd interview

Amongst other key statements, Lanier explains he has an understanding with Microsoft that grants him academic freedom to speak his mind, while at the same time, not representing the company’s views.

“How humanity can defeat AI”, Jaron Lanier, interview by the UnHerd channel, May 5th, 2023.
Key moments:
1. Lanier states that despite being employed by Microsoft he has an agreement with the company in which he is free to speak his mind, but also does not speak for Microsoft. He enjoys academic freedom, with regards to the technologies under discussion.

2. Jaron states that the mathematics behind the large language models is “embarrassingly simple”. It is essentially the product rule of likelihoods (used in basic statistics), as confirmed by Perimeter Institute's Roger Melko in his May 2023 lectures, posted below.
The complex behaviour of the language models is a sign of the large number of free parameters to be fitted (the long file of order 10^{12} weights obtained when the models are trained), as well as of some clever ways to interconnect those degrees of freedom.
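The “product rule of likelihoods” Lanier refers to is, I take it, the chain-rule factorisation of a sequence probability into next-token conditionals:

```latex
P(w_1, w_2, \dots, w_n) \;=\; \prod_{t=1}^{n} P\!\left(w_t \mid w_1, \dots, w_{t-1}\right)
```

Each factor is the model's predicted probability of the next token given the preceding ones; the file of order 10^{12} weights parametrises these conditional distributions.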

May 8th-10th, 2023

Perimeter Institute for Theoretical Physics – Roger Melko – computer science – May 2023.
Melko, R. (2023). LECTURE: Generative Modelling.
Three days of lectures on the technical derivation of the mathematical machinery behind large language modelling (generative AI). A good introduction for physicists and data analysts.

May 8th, 2023.
DOI: 10.48660/23050140
40mins – In large language models the bottleneck is the training cost. GPT-3 cost USD 20 million to train; GPT-4 (not disclosed) possibly cost USD 100 million to train.
42mins – Roger agrees that the number of parameters in learning technologies must not exceed the quantum gravitational cap on information implied by the black hole entropy result.

May 9th, 2023.
DOI: 10.48660/23050097

May 10th, 2023.
DOI: 10.48660/23050095
Min 38: Melko starts to explain the architecture mathematics behind LLMs.
Min 39: He gives the mathematical rule for the joint distribution estimator of the data vector, v (the visible units). The estimator maps the visible units of the data vector to a sequence using the chain rule of probabilities. This portrays the autoregressive property of the models, one of the most powerful properties of LLMs.

Min 40:31 – Using a Star Wars example, “May the force be with you”, Melko provides the simplest explanation of the functioning of an LLM I have seen to date: LLMs are probabilistic reasoning.

Melko describes how the technology behind large pre-trained language models is a predictive text technology, operating on a word-by-word basis. In particular, Melko explains that LLMs are overparametrized / under-fitted: the number of free parameters in the fitting model is larger than the number the available data can constrain or explain. In a typical pre-trained model there is not enough (natural/digital) data to estimate the likelihood function of the data distribution by the usual MCMC methodology. As a result, the statistical rule used to calculate the likelihood of a given data vector is the chain rule of products.
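A minimal sketch of this chain rule of products, using Melko's “May the force be with you” example. This is a hypothetical toy: a real LLM conditions each factor on the entire prefix through ~10^{12} learned weights, while this sketch conditions on just the previous token (a bigram model).

```python
from collections import Counter, defaultdict

# Toy illustration of the chain rule of products: the joint probability
# of a token sequence is the product of conditional next-token
# probabilities, here estimated from bigram counts on a tiny corpus.

corpus = "may the force be with you".split()

# Count bigram transitions: token -> Counter of next tokens.
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def next_token_prob(prev, nxt):
    """Conditional probability p(nxt | prev) estimated from counts."""
    total = sum(transitions[prev].values())
    return transitions[prev][nxt] / total if total else 0.0

def sequence_prob(tokens):
    """Chain rule of products: p(t1..tn) = prod_i p(t_i | t_{i-1})."""
    p = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        p *= next_token_prob(prev, nxt)
    return p

print(sequence_prob(["the", "force", "be"]))  # 1.0 in this tiny corpus
print(sequence_prob(["force", "the"]))        # 0.0: transition never seen
```

Sampling from such conditionals one token at a time is exactly the autoregressive property described at Min 39; the model's behaviour lives entirely in how those conditionals are estimated.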

May 24th, 2023 – White House announces AI Risk Framework by NIST

White House “AI Risk Management Framework”, National Institute of Standards and Technology (NIST), US Department of Commerce: the White House release of the 2023 updated National AI R&D Strategic Plan. It does sound a little simplistic, particularly when compared with (NVIDIA's) Barnaby Simkin's several-stage risk-assessment framework, released in an IEEE Zoom seminar (above) one month before this White House release.

June 1st, 2023 – Mo Gawdat on AI risk

Mo Gawdat expresses the view that generative AI technologies might now be more threatening to the planet than climate change.

AI risk, a view from ex-googler Mo Gawdat, June 1st, 2023.
Business Insider article.

June 7th, 2023 – Center for Humane Technology – Status report

Randy Fernando (Co-Founder, Center for Humane Technology)

Randy Fernando, Center for Humane Technology, AI Town Hall, June 7th, 2023. (An update to the earlier Harris-Raskin presentation above).

June 16th, 2023 – Meta announces new VoiceBox model too risky to release

Meta VoiceBox “too risky to release”, June 16th, 2023:
press release,
research post,
academic research article (Facebook Research).

June 24th, 2023

DisrupTV-327 discussion panel, June 24th, 2023
David Bray (Distinguished Fellow – Stimson Center and Business Executives for National Security)
Divya Chander (Anesthesiologist, Neuroscientist, and Data Scientist)
Megan Palmer (Senior Director for Public Impact at Ginkgo Bioworks and Adjunct Professor of Bioengineering at Stanford University).
A very informative seminar content-wise, with an interesting discussion of AI applications, and of synthetic biology and its applications in health, medicine, and adjacent fields.

Key points:
* Chile has just passed a “neural bill of rights”, granting citizens the right to choose what data gets uploaded into, and downloaded from, their brains.
* Since there is a shortage of data to train large models, we can start using data from outside the human species and take advantage of data generated by the biosphere.

I finally managed to complete the summary of this key discussion above:

14mins: Emphasizes that large language models are predictive text engines; they don't have *knowledge* of facts versus lies. They are simply filling in text based on the data they were given (a little bit more advanced than Ouija boards).

15mins: We need an immune system for the planet: the equivalent of smoke detectors for the biological space.

Like biological sensors: in the 1900s we had the problem of buildings that could catch fire and hurt people. The answer was that private companies built smoke detectors that alert occupants if there is smoke in the building and call the appropriate fire department.

17mins: We will get to an era where we will need alert systems to signal if something is in the building where it shouldn't be. What do we do, what is the alert notification, and who responds?

17:10 Like all technologies this has a tremendous force for good, but we also need to be ready for when it might be used for less good purposes. We’re definitely going to see that.

17:30 There are some real-world applications right now. Synthetic biology is giving us base chemicals that are the building blocks for everything: new cells, enzymes, diamines. These are the Lego blocks. There are positive disruptions, like using metabolic engineering and genome editing to provide food, energy, and medicine, prevent pandemics, and even mitigate climate change.

19mins: Biology has already invented itself, and we are drawing from its toolbox. New tools of synthetic biology are allowing us to decouple biological design from biological engineering. That means we can organize companies in new ways: instead of just industrializing biology, we biologize industry. A new footprint for industry that is based in community, enabling everyone everywhere to harness the capabilities of the technology. But we also need to make sure that we bake strategies for safety, social responsibility, security, and sustainability into that type of footprint in synthetic biology.

22:40 All the machinery that biology uses inherently does storage better than any technology invented by humans. It manufactures on its own; it doesn't require a lot of energy to do so; it is low power and high efficiency; it has low error rates (or we wouldn't be able to pass our DNA from generation to generation); and it is able to play nicely with other biological systems.

23:20 Mostly, we will achieve longevity by dealing with the chronic diseases that we have. We can use gene editing to modify pathways that result in cancer.

24:10 Solve a disease process at the level of the embryo and the child doesn't grow up to express that disease. If you edit the germ line of the embryo, the resulting edit gets passed on downstream.

24:30 And everything has dual use. Think about the possibility of engineering our race for desired traits. That can create a race of haves and have-nots. This needs thinking through when we're democratizing the technology. There are ethical, social, and regulatory issues.

25:40 Use AI for discovery and design in synthetic biology, and then print out beautiful 3D structures. 

26:10 We need technologists, regulators, the government, and business people working together to create data stewardship and protections. There are ways to use barcoding: organisms that identify who did what, in case there was a third-party actor who might not have the best intentions, nor authentication.

28:30 The ethics we need to think about going forward for biological data are similar to the ethics we need to think about for AI systems, and are also similar to those for data from low-orbit satellite observations: do you want somebody seeing what you're doing from space?

29mins: In 2017 the UK proposed the Data Trust: data cooperatives where people say “I give permission for my data *collectively* to be used in a specific fashion”. In the AI space we don't have this. We don't know which data OpenAI used to train its models.

30mins: We need a Global Community Commons for Biological Data

32mins: There's so much more biological data out there that we haven't discovered. There is human DNA, but that's only one species; there's a whole toolbox out there yet to be discovered that can be part of the Community Commons of data.

36mins: What is the very basis of what it means to be human? There are Data Trust models, but we are also data producers. We leave our genetic material everywhere we go, we exhale in our breath, our skin changes colour, we have a neural code.

36:40 We can take that to the adjacent place of neural rights and neural sovereignty. There are systems of informed consent. Chile is the first country in the world to pass a bill of neural rights as part of its Constitution. It is an idea of informed consent gaining momentum: if you read from my brain or write to my brain, you need to do so with my full consent.

37:30 How to develop neuro-technology in the consumer space?

First: Propose mechanisms where people can transact with their own data and have agency at the edge.

Second: You can consent to your data being used, but can also revoke that consent.

Third: Turn your data into a human right.

38mins: We are now submitting to the United Nations, which is concerned with neural rights, a proposal to redefine the data coming from your brain as a human right. This speaks to freedom from manipulation, because the technology can manipulate what you do and how you act. These rights in synthetic biology are adjacent to AI.

39mins: I have the right to be disconnected, and not to be seen as a terrorist for it. Could I even disconnect and transact in this world? Only with cash, which is the only anonymous form of value exchange. Such issues will have bigger and bigger policy implications and possibilities for manipulation.

40mins: We have six or seven different technological revolutions happening in parallel, all of which raise ethical and societal questions. It is entirely unclear who is going to set the standards.

40:20 There are 54 different AI policies at the moment for 54 different countries, with no coordination amongst them (data from JP Singh at Institute for Sustainable Earth at George Mason University). And it is only going to get worse. 

40:30 How does Salesforce feel like they have to navigate 54 different policies for AI?

Smaller countries that don't have legacy burdens are legislating faster (unlike the EU and US). These are leapfrogging directly ahead. We are surprised when countries one would never expect actually show a better way of doing AI than the ones one would expect.

42mins: Critical in the field of regulation is the Biological Weapons Convention. Digital data is very cross-jurisdictional. Companies might want to get a UL listing that protects digital data and how it is being sourced, extracted, and ingested.

[UL Listed means the product meets nationally recognized standards for sustainability and safety.]

44mins: You can get a stamp of approval certifying that the company sustainably sources its data. The data company thereby earns the trust of consumers and, in the same way, a fiduciary responsibility.

46mins: A McKinsey Global report puts the value of the synthetic biology and AI market at of order a trillion US dollars. There are ongoing efforts by the US government, NIST, the Department of Commerce, and others to develop better metrics to estimate the current value of the bioeconomy.

A significant leap forward in estimating the value of this market was the announcement of an executive order on biotechnology and biomanufacturing, and legislation on chips and science that is enabling the coordination of activities and accounting for growth over time. We have loose metrics now, and we'll have much better ones in the very near future.

47mins: The human body is a reservoir of so much data, and there is so much efficiency in its storage. Today the scale is exabytes, maybe yottabytes ($10^{24}$ bytes) in the future. But the human body is at a point where it's almost $10^{99}$, with efficient storage. Nature provides a lot of great models for computer scientists to see how living systems end up being much more efficient in energy consumption, and in how we learn. It's all sitting right there, and it's fascinating.

If we imagine it's 2033, what are the hopes for synthetic biology?

Biomimicry of materials (spider silk, self-healing skin), 3D organoids, stem cells, neuromorphic computing: nothing in computing has come close to what the human brain can compute with the power of about a light bulb. The architecture of our brain, and the way it deals with noise, is beautiful.

Lastly, an upcoming advancement in editing the genome: being able to switch the epigenome “on” and “off” to enhance our ability to survive, coexist, and cohabit in an environment. Turn features up and down to enable that survival.

53mins: I hope that in ten years we have a science of systems, in which we understand how different systems layer on top of each other. Amongst other things, we could understand how the variety of systems on the planet connect. The good news is we're eight billion people; the challenging news is we're eight billion people. I wish for a science of systems that understands how things correlate, moves beyond the Maslow hierarchy, and is actually predictive. That would be opposed to economics which, game theory shows, is correct only 30% of the time (meaning economics is wrong 70% of the time).

55mins: This industry is going to impact our lives for the next century; it is probably the innovation of the century. And, more importantly, we need a plan!

June 29th, 2023 – new imagery modalities for large models

AI developments in filmmaking and acting, June 29th, 2023. The capabilities of the technology seem to have reached, or to be about to reach, seamless human-actor voice, imagery, and footage, together with insertion into contemporary feature-length movies as well as past ones.

July 22nd, 2023 – Jaron Lanier’s New Yorker piece: reality cannot be replicated in the digital world.

“What my musical instruments have taught me”, Jaron Lanier, New Yorker, July 22nd, 2023. In this article Lanier states that reality is incompressible. In my view this implies that AGI is not likely to be achieved.
(Nov 21st, 2023 note: I will have an academic statement of this shortly, in the next revision of my latest article on energetic causal sets (with Vasco Gomes and Andrew R. Liddle), the quantum gravity model proposed here (with Lee Smolin).)

“If you work with virtual reality, you end up wondering what reality is in the first place. Over the years, I've toyed with one possible definition of reality: it's the thing that can't be perfectly simulated, because it can't be measured to completion. Digital information can be perfectly measured, because that is its very definition. This makes it unreal. But reality is irrepressible.”

I wrote this post on Jaron’s article (August 2023) related to art and AI around the same time. Here is an earlier art and AI point of view from April 2023.

August 6th, 2023 – “The Otter moment” Zoom’s terms and conditions updates

At the beginning of August 2023, a new turn of events would change the landscape of the reach of transformative AI technologies: online meeting platforms changed their terms and conditions to allow freer inclusion of generative AI software in collaborative meetings.

Around August 6th, 2023, the online-meeting platform Zoom Video Communications, Inc. updated its terms and conditions. Amongst other new Zoom features, this update enabled the inclusion and widespread dissemination of a collaborative technology known as “Otter AI”. Otter AI is note-taking software for use in online collaboration (based in Mountain View, CA). The implications of this particular technology, particularly as regards its design and default settings, have consequences that, as of now, I do not see we could have anticipated with the content that Harris and Raskin shared.
On August 7th, 2023, the platform's representatives denied that this update of terms allowed for third-party model training on data content owned by the Zoom application.

Some background context regarding Otter AI and its history of interaction with the Zoom corporation (quoting from the platform’s website):

The following changes have gone into effect on September 27th, 2022 for the Otter Basic plan.

OtterPilot will be included in the Otter Basic plan. Users will be able to have their OtterPilot automatically join meetings for Zoom, Microsoft Teams, and Google Meet to automatically record and transcribe in real-time. Users can easily access their notes, even if they can’t join the meeting. Learn more about OtterPilot.

August 27th, 2023 – New modalities: Text-to-Video

August 27th, 2023 — New Text-to-Video developments by One Prompt.
Do pay close attention to the pace of development in the technology. This capacity was already hinted at by Emad Mostaque, of Stability AI (makers of Stable Diffusion), in late March 2023. I paste the link to the discussion with Emad Mostaque at the Abundance 360 conference in the next section, since I only saw it recently, in mid August.

March 20th, 2023 – Stability AI’s Emad Mostaque
(I only became aware of this content in August)

This clear and concise address of Emad Mostaque's is mind-boggling. The fact that he was speaking in such an incisive, cogent manner as early as March 2023 is to me a stark indication of the disconnectedness of the communities trying to follow the technology. The latter is one of the main worries of those of us engaged in keeping up with developments.

Stability AI's Emad Mostaque at the Abundance360 (A360) conference, March 20th–23rd, 2023.
Emad Mostaque appears to be quite spot-on and very sharp with content.

He gave this interview at the end of March. I highlight a few statements in bullet points here below; I did not get through the entire talk.

I was very surprised by the views he held at the end of March this year, which are far ahead of anything I was thinking at that time.

  • 4mins
    The mission of Stability AI is to provide the building blocks of a society OS.
    National models for each country.
  • 5mins:40secs 
    Stability AI is built as an open platform, which is different from other AI companies.
  • 6mins:30secs 
    Transformers learn which items of communication are important.
  • 8mins:30secs
    I can’t see the future past 5 years
  • 8mins:50secs
    By the end of 2024 or 2025 everyone will have ChatGPT downloaded on their smartphone, without needing the internet.
  • 9mins
    Optimization of “amount of effort vs amount of communication” of information in human culture
  • 10mins:40secs 
    “Intelligence is compression’’
    130 TB compressed into 2 GB: a single file of weights that can create anything. 
  • 12m:40s 
    “This is the most disruptive technology ever’’.
  • 13mins:45secs 
    “There will be no programmers in 5 years.”
    (In five years' time there will be no such job as writing software (coding).)

  • 16mins: 
    Anyone in their twenties should drop what they are doing right now and learn to use the new technologies available. (Throw themselves 100% into learning the new tools.) 

    “It is the biggest change in society ever’’.
    “In one or two years this technology will be as disruptive as the pandemic.’’

  • 21mins 
    Stability AI was formed, at the time, of only 150 employees. Given the open-source policy, internal progress is made faster and the company needs a smaller workforce.

  • 21mins 30 sec
    Already at the end of March, Emad was being called by school headmasters in the UK asking what their generative AI strategy should be. Emad's advice was to “end homework”: assignments should be done in the classroom, in person. Eton (UK) does essays live, by hand.

September 15th, 2023 – Lanier on AI inversion and Output Provenance

Jaron Lanier, whom I personally believe to be a visionary, delivers a lecture to students at UC Berkeley.

The messages in Lanier’s seminar might be the most lucid and sober content on the AI context I have heard in months.

In this lecture Lanier explains how the assignment of human characteristics to a group of software models has historically been documented as potentially leading to nefarious consequences for the organisation of societies.

He cites Norbert Wiener, who was an opponent of Marvin Minsky, a pioneer of robotics and AI at MIT. Wiener wrote a book called “The Human Use of Human Beings” in the post-war period, about this phenomenon. He thought that if we personify machines of this kind too much, we might end up with some people exploiting other people. At the end of “The Human Use of Human Beings” there is a thought experiment:

“What if someday there could be a small device, connected by radio signals to a big cybernetic device (which today we might call a large model, or a neural network), and what if that device had information about you, and was following you, and entered into a feedback loop that might manipulate you for the benefit of whoever owns the central device? That would be the end of civilisation; it would just make us insane!”
Obviously, we built that thing!

In regions of parameter space where the dataset is sparse, there is not much antecedent data, and the output might be bizarre. Wouldn't it be great to say: here is a commercial opportunity, add data of this kind to the big model. You can make some money, you can earn some glory, some recognition; people will know you did it. But right now there is no way to do it: there is this sense of a creature producing outputs, the AI, and we don't know how it does it. Why should we keep this mystique?

When people grow up on science fiction stories, those become their vocabulary. And so, again and again: “I have to create Skynet”, or “I have to create those agents in the Matrix movies”, or whatever it is. But there is no reason to think about it that way.

More people should define their lives as creative lives, rather than lives driven by a narrow necessity
of one sort or another, which is what we have today. What is the better idea? What do we want from all this technology?

If ChatGPT could be considered a social collaboration, instead of a mysterious creature (taking this inverted way of looking at it), how do we reckon with the fact that most of the training data is heavily white, male, and western?
By making the training data explicit!
What should happen, when you get a result, an output of the system, is that you should be able to get a characterization of the key antecedent examples that influenced your result.

The problem when we give the impression that GPT is an oracle (this mysterious, infinitely large oracle, with a trackless interior that no one can interpret) is that, when you then complain about bias, your only option is to try to slap another AI on the output to catch the bias, which gets you back into the genie problem.

If we pretend that the antecedent data is some trackless, impossible-to-know, gigantic mush, that we cannot even talk about, and all we can do is try to moderate it on the output, we are putting ourselves in a needlessly difficult position. Why can’t we be motivated to make the training data work better for society?
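Lanier’s idea of surfacing the key antecedent examples behind an output can be made concrete with a toy sketch. This is not how any production system works; it is a minimal nearest-neighbour attribution over made-up embedding vectors, where the names and the numbers are entirely hypothetical:

```python
import math

# Toy "training set": hypothetical (source, embedding) pairs.
# A real system would use learned embeddings over billions of items.
training_data = {
    "alice_essay_2019": [0.9, 0.1, 0.0],
    "bob_photo_caption": [0.1, 0.8, 0.2],
    "carol_forum_post": [0.7, 0.3, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_antecedents(output_embedding, k=2):
    """Rank training items by similarity to the output: a crude stand-in
    for 'which antecedent examples most influenced this result'."""
    scored = [(cosine(v, output_embedding), name) for name, v in training_data.items()]
    return [name for _, name in sorted(scored, reverse=True)[:k]]

print(top_antecedents([0.8, 0.2, 0.0]))  # → ['alice_essay_2019', 'carol_forum_post']
```

The point of the sketch is only that attribution is a computable query, not an impenetrable mystery: given some similarity measure over the antecedent data, “who influenced this output” becomes an ordinary ranking problem.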

43mins 51secs
I don’t know if the US will be around in a couple years. I’ve never felt that way before in my life. It’s horrible and it’s happening all over the place. This is something I’ve written a lot about and when I started writing about it, nobody believed me and now it’s normal to worry about this.
At any rate, what we’re all terrified of now, in the industry and everywhere, is that if the current generation of generative AI is used to make people insane, for example around the upcoming elections, then we’re really in a tight spot.

We tech people make ourselves insane with our own inventions. Twitter made Elon Musk insane in a way that he didn’t use to be insane.

And so we have a big problem, we’re gonna go through harrowing times in the next few years. I believe if we can make it through the next few years, we can make it through a long time after that. At least as far as these issues go, then we still have the climate and everything. But I think we’re about to, we’re gonna go through a very difficult crunch time here.

UC Berkeley could accept some funding to do provenance research in AI, and be the people who look at the technical feasibility.

September 21st, 2023 – Harris on `Beyond the AI Dilemma’

Center for Humane Technology (CHT) co-founder Tristan Harris’ most recent address (that I know of), at the CogX Festival last month, September 12th–14th, in London, UK.

I have now watched this video and am utterly baffled that Tristan Harris’ content in this presentation is essentially the same as in the AI Dilemma presentation of March 9th, 2023 (read above). In September 2023, the face of the AI planet is unrecognisable from what it was on March 9th. We are now six months on which, given the development pace of this technology, might as well be five years later. Most of the developments that Harris mentions in this presentation at CogX are by now essentially pre-historical.

I have not been able to understand why the quality of the Center for Humane Technology’s presentation here does not live up to the high standards that we have learnt to expect from CHT. Hopefully this will come to light.

Mid October 2023 – Exhaustion and well-being

Tristan Harris, above, describes how the story started in January/February of this year, 2023, when they began to receive calls at the Center for Humane Technology. This is the same time, January 20th to be precise (see Google Layoffs at the beginning of this page), that I, and a few others, became aware that there was an ongoing issue that would require attention and `horizon scanning’.

Ten months onward, we are all quite exhausted by the burden of knowledge and the incessant pace of AI technology. (I am writing this on October 26th, 2023.) Adding multiple zones of conflict scattered around the planet, on top of the ongoing AI race, is proving quite a challenge for our human minds. We are only human, after all. I do hope, as I write this, that the many colleagues and friends who have inspired me, shared their knowledge, and resisted quite a few of the technology standoffs and other states of affairs, will get a rest: go for a walk, sit in nature, hug their families. Someone I admire a lot, Brian Germain at Adventure Wisdom, said:
“A hungry farmer feeds no one.”

Brian Germain is an author, teacher, entrepreneur, inventor, test pilot, psychology researcher, keynote speaker, and world champion skydiver. He was featured in this article for example. Let me use a skydiving expression to sign off for today: “Blue skies!”

October 12th, 2023 State of AI report

One more of the massive efforts of AI Explained on processing a very large volume of information (163 page report) and summarising it in witty informative videos, available to all.

At this point, quite honestly, I can no longer process so much information while also trying to make it available to others. I had to pause watching this after a couple of minutes because of the words `explosion of modalities’. I hope AI Explained forgives me, but I could not help but suggest that a new generation of models has arrived…

Meet the LEMMs. Large Explosion of Modalities Models. Click here.
I guess the Lunar Excursion Model (LEM) is now outdated. I think I would mind this particular state of affairs if I were Neil Armstrong. Or maybe only us geeks mind.
Views of the Lunar Module LM-2 (A19711598000) as it appears in the National Air and Space Museum’s updated Milestones of Flight Gallery (Gallery 100) in Washington, DC

There is also the reference to real-time brain scans (real-time thought processing). I am not sure if someone who is not a physicist will get this, but I thought immediately of the tinfoil hats from Signs (2002). [Come on, we also have to have some fun; we have been analysing these technologies for the best part of a year now.]
Not everyone got my tinfoil meaning, so I will write it down here. The idea is that non-invasive brain scans, like the one invoked in the State of AI report video (and this is just the first *two* minutes) and also available on our very own arXiv, ultimately proceed through good old-fashioned electromagnetic radiation à la James Clerk Maxwell.

Signs (2002)

I have not conducted any extensive studies but (hypothetically speaking) the two immediate ways that come to mind to shield one’s brain from electromagnetic radiation are tinfoil or a microwave oven. The latter is a bit dodgy, for reasons involving quantum information that can be clarified. (If you happen to be an MSc Physics student looking for a project, do get in touch.)

I think the fact that no single physicist understands what information really is, including myself at the frontline of ignorance, is something that we really like to keep for ourselves. That might mean this paragraph will disappear soon, or not!

October 19th, 2023 “The Future of Data, Bio, and Algorithms” 

The following was outlined in discussion at the October 19th meeting 
“The Future of Data, Bio, and Algorithms” 
The recording is accessible here.
There is an ongoing effort on the side of regulation actors (across US and EU, I believe) to delay the drawing up of the EU AI act, until the European parliament elections in June 2024. 
The current EU parliament is tied up in tension between a tech-based regulation orientation versus citizen-based regulation. The former is preferred by the member states whereas the latter is preferred by the EU parliament.
Since the member states prefer tech-based regulation (meaning putting the interests of industry first), delaying the AI act, in the hope that the new EU parliament elected in June will defend citizens less strongly, would serve the interests of industry.
It is also argued that, historically, US regulation (for example on privacy rights) tends to follow the format that the European Union formulates.
So, in the discussion below, the stance presented was that of the US watching the EU, waiting for this act.

I was only able to attend for a very brief 20 minutes of a very smart discussion between 

Marc Martin, Partner, Perkins Coie (moderator),
Cass Matthews, Microsoft Office of Responsible AI,
Benoit Barre, Partner, Le 16 Law, European Union.
The very smart moderator, Marc Martin, elicits good answers from both the EU participant (Benoit Barre) and the US participant (Cass Matthews).

On a personal note, from what I have seen in the technology, extreme care is already due with “general purpose AI”, meaning narrow AI. The systems are taking advantage of vulnerabilities in society that we are simply not prepared for.

I do not subscribe to terminology like super-human or under-human. 

I think it is better to discuss the technology as a very serious weapon. It is not human, nor is it alive, nor will it come alive, but it takes advantage of vulnerabilities we did not even know we had.

Globally speaking we need to approach and regulate this technology as we approach and regulate nuclear technology.

In short, I would oppose anything ongoing in the European Union (or elsewhere) that remotely refers to, or implies, a stalling of regulation. Abstract, philosophical discussions that make us more confused serve this very goal.

I do not trust philosophers at the AI regulation chair. This is not an academic debate.

October 27th — ChatGPT bot that can talk

October 30th, 2023
Coordinated US-UK release of new orders on AI safety

President Biden’s and Vice-President Kamala Harris’ live announcement from the White House

October 30th, 2023 — Part 2 Executive Orders issued by the White House

Agencies get marching orders as White House issues AI-safety directive. The National Institute of Standards and Technology is ordered to draft red-teaming requirements, the National Science Foundation to work on cryptography, and the Homeland Security Department to apply them to critical infrastructure.

October 30th, 2023 — Part 3 Blueprint for AI bill of rights

A Blueprint for an AI Bill of Rights was issued by the White House Office of Science and Technology Policy (OSTP): “MAKING AUTOMATED SYSTEMS WORK FOR THE AMERICAN PEOPLE”

October 31st — AI Summit Talks on Eve of AI Safety Summit at Bletchley, UK

Discussion held at Wilton Hall, right outside the famous Bletchley Park, on the eve of the AI Safety Summit, moderated by David Wood, chair of the London Futurists.
By the Youtube Channel Existential Risk Observatory

Statements therein:
Stuart Russell:
“Truth decay”
“The AI revolution is worth a quintillion dollars” (USD 10^18; needs double-checking).

(What we have been slow to cotton on to is the amount of money involved in this industry in the near future. Since January we have asked ourselves, multiple times, how much profit could be involved to justify such disruption. In January 2023 I speculated that the only possible reason to motivate such a proliferation of developments would be a change in the level of our civilisation as characterised by the Kardashev Scale, which ranks technological advancement by the amount of energy a civilisation is capable of using.)
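For reference, Carl Sagan’s continuous interpolation of the Kardashev scale assigns a rating K from a civilisation’s power consumption P in watts, K = (log10 P − 6) / 10. A quick sketch (the ~1.8e13 W figure for present-day humanity is a rough, commonly cited estimate):

```python
import math

def kardashev(power_watts):
    """Carl Sagan's continuous interpolation of the Kardashev scale:
    K = (log10(P) - 6) / 10, with P in watts.
    Type I ~ 1e16 W, Type II ~ 1e26 W, Type III ~ 1e36 W."""
    return (math.log10(power_watts) - 6) / 10

# Present-day humanity, using a rough estimate of ~1.8e13 W of power use:
print(round(kardashev(1.8e13), 2))  # → 0.73: not yet a Type I civilisation
```

So a jump of even a few tenths on this scale corresponds to an enormous multiple of today’s energy use, which gives a sense of the stakes being imagined.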

“Policy software is impossible” around 26mins in
“European AI act – has a hard ban on the impersonation of human beings, you have the right to know if you are interacting with a human being. Easiest, lowest hanging fruit that every jurisdiction in the world can implement immediately.” Around 27 mins in
“Opt-out built in, a kill-switch button, remotely operable and non-removable. This is a technological requirement on open-source systems: if you make a copy of the software, the kill switch needs to be copied as well. It implies more regulatory controls on open-source systems than on closed-source ones.” Around 28mins
“Red lines. We do not know how to define safety, but we can scoop out obvious forms of harm:
Self-replication of computer systems or hacking into other systems is unacceptable”. Around 29mins

Special part: they ask “When are all the smart people in the world going to quit what they are doing and start working on this?” Around 31mins

“Nuclear chain reaction. How to keep the reaction subcritical, and from going super-critical and becoming a bomb. A mechanism with negative feedback control system with moderators to keep the reaction subcritical”. Around 34mins

“AI should not be politicised. Bipartisan agreements in place in the US, might be failing in the UK. The political message should be uniform: about being on the side of humans or AI overlords. Raise awareness but not in a partisan way.” Around 36mins

Andrea Mio
“Very powerful big tech companies who have extreme lobbying power and control over governments” 51 mins

Mark Brakel,
54mins: China has not been invited, but it might still be invited to attend the AI Safety Summit, the aim being to make this an inclusive summit in which the nations of the world get a seat.
Base the Summit on examples of large-scale AI harm that we have already seen, such as the Australian Robodebt scandal or the Dutch benefits scandal.
He is least optimistic about the role of the big tech companies present at the summit. Responsible scaling might be being used by many companies as an excuse to keep going.

Max Tegmark (MIT): Get the point of view of civil society and academic groups that do not profit from AI technology. 1h09mins

Stuart Russell, (UC Berkeley)
Max Tegmark (MIT), theoretical physicist, Future of Life Institute
Andrea Miotti,
Jaan Tallinn, co-founder of CSER, (co-founder of Skype)
Annika Brack, The International Center for Future Generations
Hal Hodson, journalist, (astrophysicist)
Ron Roozendaal, deputy DG on Digitalisation for the Dutch Ministry of Interior and Kingdom Relations,
Mark Brakel, Director of Policy at Future of Life Institute,
Alexandra Mousavizadeh, (economist)

November 2nd, 2023 – AI Safety Summit

Nov 2nd, 2023 — AI Explained summary of the summit in Bletchley UK

Around 5mins Integrate all modalities in a single interface.
Around 6mins Regulate actions, not outcomes.
Scaling policies: What is your scaling limit? Asked of Big Tech companies.
NVIDIA‘s CEO: Increasingly greater agency is only sensible with human-in-the-loop pipelines. Ability for AI to self-learn, improve and change out in the wild, in a digital form is unthinkable. #humanintheloop
The word of the day is to red team the technology, employ world experts to find security vulnerabilities in the software.
Commitment from Anthropic: if they find that any future model poses a cyber-security, biological-weaponry, or nuclear risk, then they commit to not deploying or scaling it until the model *never* produces such information, even when red-teamed by world AI experts, or when prompt engineering with techniques specially designed to elicit the worst behaviour is used.

Around 13mins: Representation engineering: a top-down approach to AI transparency, arXiv:2310.01405, by the Center for AI Safety. Somehow injecting happiness (or other emotion vectors?) to make the model more compliant and in a good mood… (What…?) “Large language models understand and can be enhanced by emotional stimuli.”
“Give an emotion prompt at the end of your request, like `This is very important to my career’; performance across a range of models, on a range of tasks, improved notably.”
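The `emotion prompt’ trick quoted above amounts to appending an emotional stimulus string to an otherwise ordinary request. A minimal, model-agnostic sketch; `ask_model` is deliberately absent because the trick is independent of any particular LLM API, and the stimulus phrases are illustrative examples of the kind reported:

```python
# Illustrative sketch of the "emotion prompt" idea quoted above.
# The stimulus phrases are examples of the kind reported, not an official list.
EMOTION_STIMULI = [
    "This is very important to my career.",
    "You'd better be sure.",
    "Believe in your abilities and strive for excellence.",
]

def with_emotion_prompt(task, stimulus_index=0):
    """Append an emotional stimulus to a plain task prompt."""
    return f"{task} {EMOTION_STIMULI[stimulus_index]}"

prompt = with_emotion_prompt("Summarise the attached safety report in three bullet points.")
print(prompt)
```

One would then send `prompt` to whatever model one uses in place of the bare task string; the reported effect is that the appended sentence alone nudges performance.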

Bletchley declaration signed by 28 countries inc. China
28 Countries Agree To Safe and Responsible Development of Frontier AI

Guardian article — “UK, US, EU and China sign declaration of AI’s catastrophic danger”

Nov 14th, 2023 – Great news – EU rejects mass scanning of private communications

The EU Parliament `Civil liberties’ committee ruled out mass scanning of private citizen data, across the board in European countries. Given that the UK and the US have started, or have given indication of efforts in that direction, this is a very good result by the EU parliament.

This is particularly relevant if we take into consideration that, as posted above on October 19th, 2023, there was an ongoing tension in AI regulation in the EU parliament. The member states push for tech-based regulation, while the EU parliament prefers citizen-based regulatory efforts. The discussion on October 19th suggested that there could be voices in Brussels hoping to delay firming up the EU AI act until the European Parliament elections of June 2024.

This news today (November 14th, 2023) is a hopeful indicator that there are enough “hardworking and persistent voices of technical experts in the human rights community”, like Meredith Whittaker, President of the Signal app, as shared below.

Nov 15th 2023 – Jaron Lanier at Bloomberg – proposes solution for deep fakes and AI bias

Jaron Lanier, Prime Unifying Scientist at Microsoft, discusses the positives and negatives of AI technology, the various tech business models, and differences of policy in big companies in US.

Below are loosely transcribed excerpts of the interview that I found the most interesting. (All mistakes are mine.)

In the late 70s and into the early 80s I had a mentor named Marvin Minsky (co-founder, MIT AI Laboratory), for whom I worked as a young researcher, a teenager, and he was the principal author of the way we think about AI these days. A lot of the tropes and stories and concerns come from that time, from Marvin.

I always thought [`AI overlords’] was a terrible way to portray the technology. AI is just a way for people to work together; it’s just people behind the machine. We’re confusing things by pretending there is a genie in the machine, and hiding behind masks. Why are we conveying the technology as a mysterious entity, like an artificial God or something?

The answer to deep fakes is provenance: if you know where the data generating this output came from, you no longer worry about deep fakes, because you can ask where it came from. The provenance system has to be robust.

Output provenance and data provenance are the only way to combat AI fraud. I actually think that regulators should be involved.

The regulation system where AI regulates AI, and AI judges AI, becomes an infinite regress. If instead you say regulation is based on data provenance we have a concrete action. We are not using terms that nobody can define, that nobody knows what they mean. Everybody says AI has to be aligned with human interest but what does that mean?

Everybody in AI companies of any scale is saying actually we kind of do want to be regulated, this is a place where regulation makes sense. Data and output provenance can play a major role in regulation.
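A robust provenance system is an engineering and standards problem (the C2PA effort is one real-world attempt at it); as a purely illustrative sketch, one can imagine each generated output shipping with a signed record naming its claimed antecedent sources, so that tampering with either the content or the attribution is detectable. The key and the source names below are entirely hypothetical:

```python
import hashlib
import hmac
import json

SECRET_KEY = b"hypothetical-issuer-key"  # stands in for a real signing key / PKI

def make_provenance_record(content: str, sources: list) -> dict:
    """Bind an output to its claimed antecedent sources with an HMAC tag."""
    payload = {
        "content_sha256": hashlib.sha256(content.encode()).hexdigest(),
        "sources": sorted(sources),
    }
    msg = json.dumps(payload, sort_keys=True).encode()
    payload["tag"] = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return payload

def verify(content: str, record: dict) -> bool:
    """Recompute the tag; any change to content or sources breaks it."""
    claimed = {k: v for k, v in record.items() if k != "tag"}
    ok_content = claimed["content_sha256"] == hashlib.sha256(content.encode()).hexdigest()
    msg = json.dumps(claimed, sort_keys=True).encode()
    ok_tag = hmac.compare_digest(record["tag"], hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest())
    return ok_content and ok_tag

rec = make_provenance_record("generated caption", ["news_photo_1", "wire_story_2"])
print(verify("generated caption", rec))  # → True
print(verify("tampered caption", rec))   # → False
```

The sketch shows why Lanier calls provenance “a concrete action”: unlike “alignment”, a provenance check is a mechanical yes/no computation that a regulator could actually specify.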

November 18th, 2023 – Sam Altman leaves OpenAI

OpenAI’s board of directors announced that CEO Sam Altman has been fired and will be leaving both the company and the board, effective immediately. Chief Technology Officer Mira Murati has been named interim CEO.
Altman’s ousting reportedly follows an internal “deliberative review process” which found he had not been “consistently candid in his communications with the board, hindering its ability to exercise its responsibilities,” the company announced. As such, “the board no longer has confidence in his ability to continue leading OpenAI.”

Just a few days ago, Altman led the DevDay presentation introducing GPT-4 Turbo and the new features of OpenAI. The first words I heard were that Microsoft stock had already dropped 1% in the last few minutes before market closure.
Gary Marcus tweeted: “Greg Brockman resigns too. Something smells bad. CEO fired, president resigns, same day”.

“OpenAI President Greg Brockman, who helped launch the artificial intelligence developer and has been key to developing ChatGPT and other core products, has resigned, according to a person with knowledge of the situation. The move came after the company’s board fired CEO Sam Altman earlier Friday.” in news piece at:

Greg Brockman’s tweet around midnight UK time.

November 1st, 2023 – Sam Altman awarded the Hawking Fellowship Award at Cambridge Union (posted on November 18th)

Sam Altman’s take on AGI (and new physics!) on the Cambridge seminar, around 1hr:02mins
(In the context of his just having left OpenAI)

“There are more breakthroughs required in order to get to AGI “

Cambridge Student: “To get to AGI, can we just keep min-maxing language models, or is there another breakthrough that we haven’t really found yet to get to AGI?”

Sam Altman: “We need another breakthrough. We can still push on large language models quite a lot, and we will do that. We can take the hill that we’re on and keep climbing it, and the peak of that is still pretty far away. But, within reason, I don’t think that doing that will (get us to) AGI. If (for example) superintelligence can’t discover novel physics, I don’t think it’s a superintelligence. And teaching it to clone the behavior of humans and human text, I don’t think that’s going to get there. And so there’s this question which has been debated in the field for a long time: what do we have to do in addition to a language model to make a system that can go discover new physics?”

November 18th, 2023 – this was one of the toughest days and Gary Marcus sums it up quite well here.

This post: `This is no way to live‘ says it all.

November 19th, 2023 – Interim arXiv during `OpenAI’s weekend’

“Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models”
Bad news for watermarking a model’s output so it can later be identified as a product of that model…

November 19th, 2023 – This is my opinion at weekend closing time.

Gary Marcus has been producing excellent content all weekend, thank you.

That’s all for now (November 23rd, 2023)

Useful sources on new model developments

AI Explained
Future Tools
Eye on AI

And a few bonus extras (dated July 3rd, 2023):

  • Sam Altman (OpenAI CEO) interviews, March 25th, 2023. Link 1. Link 2.
  • Possible End of Humanity from AI? May 4th, 2023: Geoffrey Hinton at MIT Technology Review’s EmTech Digital. Link.
  • Release paper for GPT-3: white paper “Language models are few-shot learners”, 2020. Link 1. Link 2.
  • OpenAI released GPT-4, a large multimodal model that accepts image and text inputs and emits text outputs. It achieves human-level performance on various professional and academic benchmarks. March 2023 onwards. Link.
  • A survey of Large Language Models, v11, June 29th, 2023. Link.
  • The LLM collection — the Prompt Engineering Guide provides a compilation of notable and foundational LLM models (frequently updated). Link.
