(Last updated: November 23rd, 2023)
This is a short introduction explaining why I started this page. I cite many sources, but all mistakes and errors are my own responsibility.
I am part of a discussion collective of intellectuals brought together by Silicon Valley tech companies, jointly with companies in academic and scientific publishing. The discussion group is simply a town hall where members from very distinct academic and technological backgrounds can bring up and discuss any developments in technology, science, or industry.
This might include the discussion of key ongoing societal or economic transformations anywhere on the planet. Through this discussion group I had been aware of the ChatGPT technology since roughly the summer of 2022. However, I did not pay proper attention until December 2022, when we held our first video meeting on the subject.
By mid-April 2023, prompted in part by Jaron's New Yorker piece below, I had learnt enough about the ongoing events to become convinced that such developments merited devoted attention by academics. I decided to use my own expertise as a theoretical physicist, cosmologist, and data analyst to look into the generative AI technology. I wanted to read the software, and the data being acquired and read, in order to assess, for myself, the capability and behaviour of the upcoming technologies. I guess this is how a physicist thinks: when confusion abounds, in theory or data contexts, we like to boil it down to "What do we actually know for a fact?"
Our priority in these discussion groups, in which this knowledge is circulated, has always been to not generate unwarranted frenzied panic. However, it also quickly became clear that wide-audience education on the rapidly advancing AI technology would play a big role in a smooth transition to a new society that we are about to witness / are already witnessing.
Below is a step-by-step itemisation of the developments, ordered by date. The items contain the sources I found the most useful as 2023 unfolded and the generative (large language) models were rolled out.
I have to thank a team of tireless, devoted, and engaged colleagues and friends, from all corners of society, who were vital in providing much of the content I was privileged to have access to, in real time. All opinions and mistakes are my own.
January 20th, 2023 — Google layoffs included many voices of open source software, including Chris DiBona (Director and Founder of Google's Open Source Programs Office, OSPO), Cat Allman (Google Open Source Program manager), and Mike Frumkin (Founder of Google's Accelerated Science Team).
The layoffs happened suddenly, announced by email without any notice, and included some of the strongest, most brilliant individuals in the company. I took this news very seriously, as did many of us in these circles. Most unsettling was the realisation that some of the most talented minds (DiBona, Frumkin, Allman), who had worked inside big tech towards just and open technological societies for decades, could lose their positions as gatekeepers of software security and trustworthiness in the very groups they had founded.
Personally, this was the first alert that a new, significant, and disquieting change inside big tech might be about to happen, and most likely was already underway. Knowing personally the quality of the individuals in question gave me a good grasp of the magnitude of power being called upon to carry through these layoffs and to implement and bring forth the new technologies.
January 20th, 2023 marks for me a turning point in realising that a change in trajectory of the planet was taking place. I remain hopeful that we will be able to reverse this trajectory, as societies and as a whole.
"The AI Dilemma": Center for Humane Technology, March 9th, 2023, with Tristan Harris and Aza Raskin.
This video was the most up-to-date source of information until August 6th, 2023 (scroll below for more on "The Otter moment"). It is also, in my view, the most authoritative and well-prepared content. By March 9th, Harris and Raskin had already anticipated many of the developments that took place subsequently, at least five months ahead of their time.
Key moments:
1. 2023 is the year when all content-based identification will break down. We might be moving towards a future in which we only trust those with whom we interact personally, or in direct line of sight.
2. The technology is in accelerated development. We do not currently have a means for measuring the rate of acceleration.
3. Extractive technologies.
Here are the citations mentioned in The AI Dilemma:
Emerging Tech Trend Report, 2023, Amy Webb at SXSW 2023, March 29th, 2023.
“Futurist Amy Webb, CEO of the Future Today Institute and professor at NYU Stern School of Business, provides a data-driven analysis for emerging tech trends, and shows perspective-changing scenarios for the future.”
"There is no AI", Jaron Lanier (founding father of Virtual Reality), April 20th, 2023, New Yorker article.
“Risk-based methodology for deriving scenarios for testing artificial intelligence systems” by Barnaby Simkin – NVIDIA – April 2023
The best derivation of principles for AI regulation I have seen to date (September 5th, 2023), as far as a concrete strategy and clear outline are concerned.
My own opinion is that the interests of industry’s stakeholders need to be taken into account as well as the civil society’s interests and policy makers’ interests.
I believe we need to bear everyone's interests in mind during regulation, otherwise cooperation will not succeed. We must not overlook the financial interests of those we are trying to regulate, or they will sidestep the rules and consign the whole regulatory enterprise to legal limbo.
“How humanity can defeat AI“, Jaron Lanier, interview by UnHerd channel, May 5th, 2023.
Key moments:
1. Lanier states that despite being employed by Microsoft he has an agreement with the company in which he is free to speak his mind, but also does not speak for Microsoft. He enjoys academic freedom, with regards to the technologies under discussion.
2. Jaron states that the mathematics behind the large language models is “embarrassingly simple”. This is essentially the product rule of likelihoods (used in basic statistics) and is confirmed by Perimeter Institute’s Roger Melko’s lecture in May 2023, posted below.
The complex behaviour of the language models is a sign of the large number of free parameters to be fitted (the long file of order 10^{12} weights obtained when the models are trained), as well as of some clever ways to interconnect those degrees of freedom.
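To make "the product rule of likelihoods" concrete, here is the standard autoregressive (chain-rule) factorisation, written in my own hedged notation with $\theta$ standing for the trained weights rather than any particular model's convention:

$$ p_\theta(w_1, w_2, \ldots, w_N) \,=\, \prod_{i=1}^{N} p_\theta(w_i \mid w_1, \ldots, w_{i-1}) $$

Each factor is the probability of the next word (token) given everything generated so far; the apparent sophistication lives in the network that computes these conditional probabilities, not in the rule itself.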
Perimeter Institute for Theoretical Physics‘ Roger Melko – computer science – May 2023.
Melko, R. (2023). LECTURE: Generative Modelling.
3 days of lectures of technical derivation of the mathematical machinery behind large language modelling (generative AI). Good introduction for physicists and data analysts.
May 8th, 2023.
DOI : 10.48660/23050140
40mins – In large language models the bottleneck is the training cost. GPT-3 cost USD 20 million to train; GPT-4 (not disclosed) possibly cost USD 100 million to train.
42mins – Roger agrees that the number of parameters in learning technologies must not exceed the quantum-gravitational cap on information set by the black hole entropy result.
May 9th, 2023.
DOI : 10.48660/23050097
May 10th, 2023.
DOI 10.48660/23050095
Min 38: Melko starts to explain the architecture mathematics behind LLMs.
Min 39: Gives the mathematical rule for the joint-distribution estimator of the data vector, v (visible units). The estimator maps the visible units of the data vector to a sequence using the chain rule of probabilities. This portrays the autoregressive property of the models, which is one of the most powerful properties of LLMs.
Min 40:31 — Using a Star Wars example, "May the force be with you", Melko provides the simplest explanation of the functioning of an LLM that I have seen to date. LLMs are probabilistic reasoning.
Summary:
Melko describes how the technology involved in large language pre-trained models is a predictive-text technology, which operates on a word-by-word basis. In particular, Melko explains that LLMs are overparametrised: the number of free parameters in the fitting model is larger than the number of parameters that the available data requires in order to be fit, or explained. In a typical pre-trained model there is not enough (natural/digital) data to estimate the likelihood function of the data distribution by the usual MCMC methodology. As a result, the statistical rule used to calculate the likelihood of a given data vector is the chain rule of products.
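As a toy illustration of this word-by-word prediction (my own minimal Python sketch, not Melko's code), the "May the force be with you" example can be mimicked with a hand-written table of conditional probabilities; a real LLM replaces the table with a trained neural network holding on the order of $10^{11}$ to $10^{12}$ weights:

```python
import random

# Toy word-by-word generator: each next word is sampled from an estimated
# conditional distribution p(next word | previous words), exactly the chain
# rule described above. The hand-written table below stands in for the
# trained network of a real LLM.
CONDITIONALS = {
    ("may",): {"the": 0.9, "I": 0.1},
    ("may", "the"): {"force": 0.8, "odds": 0.2},
    ("may", "the", "force"): {"be": 1.0},
    ("may", "the", "force", "be"): {"with": 1.0},
    ("may", "the", "force", "be", "with"): {"you": 0.95, "us": 0.05},
}

def sample_next(context):
    """Sample one word from p(next | context); None if the context is unseen."""
    dist = CONDITIONALS.get(tuple(context))
    if dist is None:
        return None
    words, probs = zip(*dist.items())
    return random.choices(words, weights=probs, k=1)[0]

def generate(prompt, max_words=10):
    """Generate text one word at a time, i.e. autoregressively."""
    words = prompt.lower().split()
    for _ in range(max_words):
        nxt = sample_next(words)
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

print(generate("May"))  # most likely: "may the force be with you"
```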
White House "AI Risk Management Framework", National Institute of Standards and Technology (NIST), US Department of Commerce. White House release of the 2023 updated National AI R&D Strategic Plan. It does sound a little simplistic, particularly when compared with (NVIDIA's) Barnaby Simkin's several-stage risk-assessment framework, released in an IEEE Zoom session above, one month before this White House release.
AI risk, a view from ex-googler Mo Gawdat, June 1st, 2023.
Business Insider article.
Randy Fernando, Center for Humane Technology, AI Town Hall, June 7th, 2023. (An update to the earlier Harris-Raskin presentation above).
Meta Voicebox 'too risky to release', June 16th, 2023:
press release,
research post,
academic research article (Facebook Research).
DisrupTV-327 discussion panel, June 24th, 2023
Panelists:
David Bray (Distinguished Fellow – Stimson Center and Business Executives for National Security)
Divya Chander (Anesthesiologist, Neuroscientist, and Data Scientist)
Megan Palmer (Senior Director for Public Impact at Ginkgo Bioworks and Adjunct Professor of Bioengineering at Stanford University).
A very informative seminar content-wise, with an interesting discussion of AI applications and of synthetic biology and its applications in health, medicine, and adjacent fields.
Key points:
* Chile has just passed a "neural bill of rights", granting citizens the right to choose what data gets uploaded into and downloaded from their brains.
* Since there is a shortage of data to train large models, we can start using data from outside the human species and take advantage of data generated by the biosphere.
I finally managed to complete the summary of this key discussion above:
14mins: Emphasis that large language models are predictive-text engines; they don't have *knowledge* of facts versus lies. They are simply filling in text based on the data they were given (a little bit more advanced than Ouija boards).
15mins: We need an immune system for the planet: the equivalent of smoke detectors for the biological space.
Like biological sensors: in the 1900s we had a problem of buildings that could catch fire and hurt people. The answer was that private companies built smoke detectors that alert if there is smoke in the building and call the appropriate fire department.
17mins: We will get to an era where we will need alert systems to signal if something is in the building where it shouldn't be. What do we do, what is the alert notification, and who responds?
17:10 Like all technologies this has a tremendous force for good, but we also need to be ready for when it might be used for less good purposes. We’re definitely going to see that.
17:30 There are some real-world applications right now. Synthetic biology is giving us base chemicals that are the building blocks for everything: new cells, enzymes, diamines. These are the Lego blocks. There are positive disruptions like using metabolic engineering and genome editing to provide food, energy, and medicine, to prevent pandemics, and even to address climate change.
19mins Biology has already invented itself, and we're drawing from its toolbox. New tools of synthetic biology are allowing us to decouple biological design from biological engineering. That means we can organize companies in new ways: instead of just industrializing biology, we biologize industry. A new footprint for industry that is based in community and enables everyone everywhere to harness the capabilities of the technology. But we also need to make sure that we bake strategies for safety, social responsibility, security, and sustainability into that type of footprint in synthetic biology.
22:40 All the machinery that biology uses inherently does storage better than any technology invented by humans. It manufactures on its own and it doesn't require a lot of energy to do it: it is low power, high efficiency, has low error rates (or we wouldn't be able to pass on our DNA from generation to generation), and it is able to play nice with other biological systems.
23:20 Most of the ways we will achieve longevity is by dealing with the chronic disease that we have. We can use gene editing to modify pathways that result in cancer.
24:10 Solve a disease process at the level of the embryo and the child doesn't grow up to express that disease. If you edit the germ line of the embryo, the resulting edit gets passed on downstream.
24:30 And everything has dual use. Think about the possibility of engineering our race for desired traits. That can create a race of haves and have-nots. This needs thinking when we’re democratizing the technology. There are ethical, social, regulatory issues.
25:40 Use AI for discovery and design in synthetic biology, and then print out beautiful 3D structures.
26:10 We need technologists, regulators, the government, and business people working together to create data stewardship and protections. There are ways to use barcoding of organisms to identify who did what, in case a third-party actor does not have the best intentions or proper authentication.
28:30 The ethics we need to think about going forward for biological data are similar to the ethics we need to think about for AI systems, and also to those for low-orbit satellite observations from space: do you want somebody seeing what you're doing from space?
29mins: In 2017 the UK proposed the Data Trust: data cooperatives where people say "I give permission for my data *collectively* to be used in a specific fashion". In the AI space we don't have this. We don't know which data OpenAI used to train its models.
30mins: We need a Global Community Commons for Biological Data
32: there’s so much more biological data out there that we haven’t discovered, there is human DNA, but that’s only one species, there’s a whole toolbox out there yet to be discovered, can be part of the Community Commons of data.
36mins: What is the very basis of what it means to be human. There are Data Trust models but we are also data producers. We leave our genetic material everywhere we go, we exhale in our breath, our skin changes color, we have a neural code.
36:40 We can take that to the adjacent place of neural rights and neural sovereignty. There are systems of informed consent. Chile is the first country in the world to pass a bill of neural rights as part of its Constitution. It is an idea of informed consent gaining momentum: if you are reading from my brain or writing to my brain, you need to do so with my full consent.
37:30 How to develop neuro-technology in the consumer space?
First: Propose mechanism where people can transact with their own data and have agency at the edge.
Second: You can consent to your data being used but can revoke consent.
Third: Turning your data into a human right.
38mins: We are now making submissions to the United Nations, which is concerned with neural rights: redefine the data coming from your brain as a human right. This speaks to freedom from manipulation, because the technology can manipulate what you do and how you act. These rights in synthetic biology are adjacent to AI.
39mins: I have the right to be disconnected and not to be seen as a terrorist. Could I even disconnect and transact in this world? With cash which is the only anonymous form of value exchange. Such issues will have bigger and bigger policy implications and possibilities for manipulation.
40mins: We have six or seven different technological revolutions happening in parallel, all of which raise vague ethical and societal questions. It is entirely unclear who is going to set the standards.
40:20 There are 54 different AI policies at the moment for 54 different countries, with no coordination amongst them (data from JP Singh at Institute for Sustainable Earth at George Mason University). And it is only going to get worse.
40:30 How does Salesforce feel like they have to navigate 54 different policies for AI?
Smaller countries that don’t have legacy burdens are legislating faster (unlike the EU and US). These are leapfrogging directly ahead. We are surprised when countries one would never expecte actually show a Better Way in doing AI than the ones you would expect.
42mins: Critical in the field of regulation is the Biological Weapons Convention. The digital data is very cross-jurisdictional. Companies might want to get a UL listing that protects the digital data and how it's being sourced and extracted (and ingested).
[UL Listed means the product meets nationally recognized standards for sustainability and safety.]
44mins: You can get a stamp of approval that certifies the company sources its data sustainably. The data company earns the trust of consumers and, in the same way, a fiduciary responsibility.
46mins: McKinsey Global report on the value of the synthetic biology and AI market, of order a trillion US$. There are ongoing efforts by the US government, NIST, the Department of Commerce, and others to develop better metrics to estimate the current value of the bioeconomy.
A significant leap forward in estimating the value of the market in this industry was the announcement of an executive order on biotechnology and biomanufacturing, and the CHIPS and Science legislation, which are enabling the coordination of activities and also accounting for growth over time. We have loose metrics now, and we'll have much better ones in the very near future.
47mins: The human body is a reservoir of so much data, and there is so much efficiency in its storage. Today the scale is exabytes, maybe yottabytes ($10^{24}$) in the future. But we're at a point where it's almost $10^{99}$; that's what the human body is doing, with efficient storage. Nature provides a lot of great models for computer scientists to see how living systems end up being much more efficient with energy consumption, and with how we learn. It's all sitting right there and it's fascinating.
If we imagine it’s 2033 what are the hopes for synthetic biology?
Biomimicry of materials (spider silk, self-healing skin), 3D organoids, stem cells, neuromorphic computing: nothing in computing has come close to what the human brain can compute with the power of about a light bulb. The architecture of our brain and the way it deals with noise is beautiful.
Lastly upcoming advancement in editing the genome: being able to switch the epigenome “on” and “off” to enhance our ability to survive and coexist and cohabit in an environment. Turn features up and down to enable that survival.
53mins: I hope that in ten years we have a science of systems, in which we understand how different systems layer on top of each other. We can, amongst other things, understand how the variety of systems on the planet connect. The good news is we're eight billion people; the challenging news is we're eight billion people. I wish for a science of systems that understands how things correlate, moves beyond the Maslow hierarchy, and is actually predictive. That would be as opposed to economics, which game theory shows is correct only 30% of the time (meaning economics is wrong 70% of the time).
55mins: This industry is going to impact our lives for the next century, it is probably the innovation of the century. And more importantly we need a plan!
AI developments in filmmaking and acting, June 29th, 2023. The capabilities of the technology seem to have reached, or to be about to reach, seamless synthesis of human actors, voice, imagery, and footage, together with their insertion into contemporary feature-length movies as well as past ones.
“What my musical instruments have taught me”, Jaron Lanier, New Yorker, July 22nd, 2023. In this article Lanier states that reality is incompressible. In my view this implies that AGI is not likely to be achieved.
— (Nov 21st, 2023 note: I will have an academic statement of this shortly, in the next revision of my latest article on energetic causal sets (with Vasco Gomes and Andrew R. Liddle), my quantum gravity model proposed here (with Lee Smolin).)
“If you work with virtual reality, you end up wondering what reality is in the first place. Over the years, I’ve toyed with one possible definition of reality: it’s the thing that can’t be perfectly simulated, because it can’t be measured to completion. Digital information can be perfectly measured, because that is its very definition. This makes it unreal. But reality is irrepressible.”
I wrote this post on Jaron’s article (August 2023) related to art and AI around the same time. Here is an earlier art and AI point of view from April 2023.
At the beginning of August 2023 a new turn of events would change the extent of the reach of transformative AI technologies: online meeting platforms changed their terms and conditions to allow freer inclusion of generative AI software in collaborative meetings.
Around August 6th, 2023 the online-meeting platform Zoom Video Communications, Inc. updated its terms and conditions. Amongst other new Zoom features, this update enabled the inclusion and widespread dissemination of a collaborative technology known as "Otter AI". Otter AI is note-taking software for use in online collaboration (based in Mountain View, CA). The implications of this particular technology, particularly as regards its design and default settings, have consequences that, as of now, I do not see we could have anticipated with the content that Harris and Raskin shared.
On August 7th, 2023, the platform's representatives denied that this update of terms allowed for third-party model training on content from the Zoom application.
Some background context regarding Otter AI and its history of interaction with the Zoom corporation (quoting from the platform’s website):
“The following changes have gone into effect on September 27th, 2022 for the Otter Basic plan.
OtterPilot will be included in the Otter Basic plan. Users will be able to have their OtterPilot automatically join meetings for Zoom, Microsoft Teams, and Google Meet to automatically record and transcribe in real-time. Users can easily access their notes, even if they can’t join the meeting. Learn more about OtterPilot.“
August 27th, 2023 — New text-to-video developments by One Prompt
Do pay close attention to the pace of development in the technology. This capacity was already hinted at by Emad Mostaque of Stability AI (Stable Diffusion) a few months earlier, in late March 2023. I paste the link to the discussion with Emad Mostaque at the Abundance 360 conference in the next section, since I only saw it recently, in mid-August.
Stability AI's Emad Mostaque at the Abundance360 (A360) conference, March 20th-23rd, 2023.
Emad Mostaque appears to be quite spot on and very sharp with content.
He gave this interview at the end of March. I highlight a few statements in bullet points below; I did not get through the entire talk.
I was very surprised by the views he held at the end of March this year, which are far ahead of anything I was thinking at that time.
The messages in Lanier's seminar might be the most lucid and sober content on AI I have heard in months.
In this lecture Lanier explains how the assignment of human characteristics to a collection of software models has historically been documented as potentially leading to nefarious consequences for the organisation of societies.
3mins
He cites Norbert Wiener, who was an opponent of Marvin Minsky, a pioneer of robotics and AI at MIT. Wiener wrote a book called 'The Human Use of Human Beings' in the post-war period, about this phenomenon. He thought that if we personify machines of this kind too much, we might end up with some people exploiting other people. At the end of 'The Human Use of Human Beings' there is a thought experiment:
"What if someday there could be a small device that was connected by radio signals to a big cybernetic device, which we might call a Large Model, or a neural network or something, and what if that device had information about you, and was following you, and entered into a feedback loop that might manipulate you for the benefit of whoever owns the central device? That would be the end of civilisation, it would just make us insane!"
Obviously we built that thing!
29mins
In regions of parameter space where the dataset is sparse, there is not much antecedent data, and the output might be bizarre. Wouldn't it be great to say: there's a commercial opportunity, add data of this kind to the big model. You can make some money, you can earn some glory, some recognition, people will know you did it. But right now there is no way to do it; there's this sense of a creature producing outputs, the AI, and we don't know how it does it. Why should we keep this mystique?
30mins
When people grow up on science fiction stories, it becomes their vocabulary. And so, again and again, it is "I have to create Skynet", or "I have to create those agents in the Matrix movies", or whatever it is. But there is no reason to think about it that way.
32mins
More people should define their lives as creative lives, rather than lives driven by a narrow necessity of one sort or another, which is what we have today. What is the better idea? What do we want from all this technology?
40mins
If ChatGPT could be considered a social collaboration instead of a mysterious creature, if you invert your way of looking at it, how do we reckon with the fact that most of the training data is heavily white and male and western?
By making the training data explicated!
What should happen, when you get a result (an output of the system), is that you should be able to get a characterization of the key antecedent examples that influenced your result.
The problem with giving the impression that GPT is an oracle (this mysterious, infinitely large oracle with a trackless interior that no one can interpret) is that, when you then complain about bias, your only option is to try to slap another AI on the output to try to catch the bias, which gets you back into the genie problem.
42mins
If we pretend that the antecedent data is some trackless, impossible-to-know, gigantic mush, that we can't even talk about it, and that all we can do is try to moderate the output, we're putting ourselves in a needlessly difficult position. Why can't we be motivated to make the training data work better for society?
43mins 51secs
I don’t know if the US will be around in a couple years. I’ve never felt that way before in my life. It’s horrible and it’s happening all over the place. This is something I’ve written a lot about and when I started writing about it, nobody believed me and now it’s normal to worry about this.
At any rate, what we’re all terrified of now in the industry and everywhere is that if the current generation of generative AI is used to make people insane, like for upcoming elections we’re really in a tight spot.
46mins
We tech people make ourselves insane with our own inventions. Twitter made Elon Musk insane in a way that he didn’t use to be insane.
47mins
And so we have a big problem, we’re gonna go through harrowing times in the next few years. I believe if we can make it through the next few years, we can make it through a long time after that. At least as far as these issues go, then we still have the climate and everything. But I think we’re about to, we’re gonna go through a very difficult crunch time here.
47mins:35secs
UC Berkeley could accept some funding to do provenance research in AI, be the people who look at the technical feasibility.
Tristan Harris, above, describes how the story started for the Center for Humane Technology in January/February this year, 2023, when they started to receive calls at the Center. This is the same time, January 20th to be precise (see Google layoffs at the beginning of this page), that I, and a few others, became aware that there was an ongoing issue that would require attention and 'horizon scanning'.
Ten months onward we are all quite exhausted with the burden of knowledge and the incessant pace of the AI technology. (I am writing this on October 26th, 2023.) Adding multiple zones of conflict scattered around the planet, on top of the ongoing AI race, is proving quite a challenge for our human minds. We are only human after all. I do hope, as I write this, that the many colleagues and friends who have inspired me, shared their knowledge, and resisted quite a few of the technology standoffs and other states of affairs will get a rest, go for a walk, sit in nature, and hug their families. Someone I admire a lot, Brian Germain at Adventure Wisdom, said:
"A hungry farmer feeds no one."
Brian Germain is an author, teacher, entrepreneur, inventor, test pilot, psychology researcher, keynote speaker, and world champion skydiver. He was featured in this article, for example. Let me use a skydiving expression to sign off for today: "Blue skies".
There is also the reference to real-time brain scans. (Real time thought processing). I am not sure if someone who is not a physicist will get this, but I thought immediately of tinfoil hats from Signs (2002). [Come on, we also have to have some fun, and we have been doing this, analysing these technologies for the best part of one year now.]
Not everyone got my tinfoil meaning, so I will write it down here. The idea is that non-invasive brain scans, like the one invoked in the State of AI report video (and this is just the first *two* minutes) and also available on our very own arXiv, ultimately proceed through good old-fashioned electromagnetic radiation à la James Clerk Maxwell.
I have not conducted any extensive studies, but (hypothetically speaking) the two immediate ways that come to mind to shield one's brain from electromagnetic radiation are tinfoil or a microwave oven. The latter is a bit dodgy for reasons that can be clarified involving quantum information. (If you happen to be an MSc Physics student looking for a project, do get in touch.)
I think the fact that no single physicist understands what information really is, including myself at the frontline of ignorance, is something that we really like to keep for ourselves. That might mean this paragraph will disappear soon, or not!
The following was outlined in discussion at the October 19th meeting
“The Future of Data, Bio, and Algorithms”
The recording is accessible here.
There is an ongoing effort on the side of regulation actors (across the US and EU, I believe) to delay the drawing up of the EU AI Act until the European Parliament elections in June 2024.
The current EU Parliament is caught in a tension between a tech-based regulation orientation and a citizen-based one. The former is preferred by the member states, whereas the latter is preferred by the EU Parliament.
Since the member states prefer tech-based regulation (meaning putting the interests of industry first), if the AI Act can be delayed, in the hope that the new EU Parliament elected in June will defend citizens less, then the interests of industry can be fulfilled.
It is also argued herein that, historically, US regulation (like privacy rights) tends to follow regulation in the format that the European Union formulates.
So, in the discussion below, the stance being presented was the US watching the EU, waiting for this Act.
I was only able to attend for a very brief 20 minutes of a very smart discussion between
Marc Martin, Partner, Perkins Coie,
Moderator Cass Matthews, MICROSOFT Office of Responsible AI
Benoit Barre, Partner, Le 16 Law, European Union
https://www.digitalhollywood.com/ai-bill-of-rights—session-four
The very smart moderator, Marc Martin elicits good answers from both the EU participant (Benoit Barre) and the US participant (Cass Matthews).
On a personal note, from what I have seen in the technology, extreme care is due already with "general purpose AI", meaning narrow AI. The systems are taking advantage of vulnerabilities in society that we are simply not prepared for.
I do not subscribe to terminology like super-human or under-human.
I think it is better to discuss the technology as a very serious weapon. It is not human, nor is it alive, nor will it come alive, but it takes advantage of vulnerabilities we did not even know we had.
Globally speaking we need to approach and regulate this technology as we approach and regulate nuclear technology.
In short, I would oppose anything ongoing in the European Union (or elsewhere) that remotely refers to, or implies a stalling of regulation. Abstract, philosophical discussions that make us more confused are reaching this very goal.
I do not trust philosophers at the AI regulation chair. This is not an academic debate.
President Biden's and Vice-President Kamala Harris's LIVE announcement from the White House
Agencies get marching orders as White House issues AI-safety directive. The National Institute of Standards and Technology is ordered to draft red-teaming requirements, the National Science Foundation to work on cryptography, and the Homeland Security Department to apply them to critical infrastructure.
A Blueprint for an AI Bill of Rights was issued by the White House Office of Science and Technology Policy (OSTP): "Making Automated Systems Work for the American People"
Statements therein:
Stuart Russell’s
“Truth decay”
"The AI revolution is worth a quintillion dollars" (USD 10^18); needs double-checking.
(What we have been slow to cotton on to is the amount of money involved in this industry in the upcoming future. Since January we have asked ourselves multiple times how much profit could be involved to justify such disruption. In January 2023 I speculated that the only possible reason to motivate such a proliferation of developments would be a change in the level of our civilisation as characterised by the Kardashev scale, which ranks technological advancement by the amount of energy a civilisation is capable of using.)
“Policy software is impossible” around 26mins in
“European AI act – has a hard ban on the impersonation of human beings, you have the right to know if you are interacting with a human being. Easiest, lowest hanging fruit that every jurisdiction in the world can implement immediately.” Around 27 mins in
"Opt-out built in, a kill-switch button, remotely operable and non-removable. This is a technological requirement on open source systems: if you make a copy of the software, the kill switch needs to be copied as well. This implies more regulatory controls on open source systems than on closed source ones." Around 28mins
“Red lines. We do not know how to define safety, but we can scoop out obvious forms of harm:
Self-replication of computer systems or hacking into other systems is unacceptable”. Around 29mins
Special part: they ask “When are all the smart people in the world going to quit what they are doing and start working on this?” Around 31mins
“Nuclear chain reaction. How to keep the reaction subcritical, and from going super-critical and becoming a bomb. A mechanism with negative feedback control system with moderators to keep the reaction subcritical”. Around 34mins
“AI should not be politicised. Bipartisan agreements in place in the US, might be failing in the UK. The political message should be uniform: about being on the side of humans or AI overlords. Raise awareness but not in a partisan way.” Around 36mins
Andrea Miotti
“Very powerful big tech companies who have extreme lobbying power and control over governments” 51 mins
Mark Brakel,
54mins China has not been invited, but it might still be invited to attend the AI safety Summit, aiming at making this an inclusive summit in that nations of the world will get a seat.
Base the Summit on examples of large-scale AI harm that we have already seen, such as the Australian robodebt scandal or the Dutch benefits scandal.
Least optimistic on the role of big tech companies present at the summit. Responsible scaling might be being used by many companies as an excuse to keep going.
Max Tegmark (MIT): Get the point of view of civil society and academic groups that do not profit from AI technology. 1h09mins
Panel:
Stuart Russell, (UC Berkeley)
Max Tegmark (MIT), theoretical physicist, Future of Life Institute
Andrea Miotti,
Jaan Tallinn, co-founder of CSER, (co-founder of Skype)
Annika Brack, The International Center for Future Generations
Hal Hodson, journalist, (astrophysicist)
Ron Roozendaal, deputy DG on Digitalisation for the Dutch Ministry of Interior and Kingdom Relations,
Mark Brakel, Director of Policy at Future of Life Institute,
Alexandra Mousavizadeh, (economist)
More info: https://lu.ma/n9qmn4h6
Around 5mins Integrate all modalities in a single interface.
Around 6mins Regulate actions, not outcomes.
Scaling policies: What is your scaling limit? Asked of Big Tech companies.
NVIDIA‘s CEO: Increasingly greater agency is only sensible with human-in-the-loop pipelines. Ability for AI to self-learn, improve and change out in the wild, in a digital form is unthinkable. #humanintheloop
The word of the day is to red team the technology, employ world experts to find security vulnerabilities in the software.
Commitment from Anthropic: If they find that any future models pose cyber-security, biological-weaponry, or nuclear risk, then they commit to not deploying or scaling them until the model *never* produces such information, even when red-teamed by AI world experts or with prompt engineering using special techniques designed to elicit the worst behaviour.
KEY MOMENT
Around 13mins: Representation engineering: a top-down approach to AI transparency, arXiv:2310.01405, by the Center for AI Safety. Somehow injecting happiness (or whatever other emotion vectors?) to make the model more compliant and in a good mood… (What…?) "Large language models understand and can be enhanced by emotional stimuli."
"Give an emotion prompt at the end of your request, like 'This is very important for my career'; performance across a range of models, on a range of tasks, improved notably."
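As a minimal sketch of that prompt pattern (my own illustration; `query_model` is a hypothetical stand-in for whatever LLM client is actually used, so only the string handling is meant literally):

```python
# Sketch of the "emotion prompt" trick quoted above: append an emotional
# stimulus sentence to an otherwise ordinary request and compare the answers.
# `query_model` is a hypothetical placeholder, not a real API.
def add_emotion_stimulus(task: str,
                         stimulus: str = "This is very important for my career.") -> str:
    return f"{task}\n\n{stimulus}"

def query_model(prompt: str) -> str:
    raise NotImplementedError("swap in a call to your preferred LLM client here")

plain = "Summarise the attached meeting notes in five bullet points."
emotional = add_emotion_stimulus(plain)
print(emotional)
# The reported finding is that answers to `emotional` score measurably better
# than answers to `plain` across a range of models and tasks.
```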
Guardian article — "UK, US, EU and China sign declaration of AI's catastrophic danger"
This is particularly relevant if we take into consideration that, as posted above on October 19th, 2023, there was an ongoing tension in AI regulation between the EU member states and the EU Parliament: the member states push for tech-based regulation, while the Parliament prefers citizen-based regulatory efforts. The discussion on October 19th suggested that there could be voices in Brussels hoping to delay firming up the EU AI Act until the European Parliament elections of June 2024.
This news today (November 14th, 2023) is a hopeful indicator that there are enough “hardworking and persistent voices of technical experts in the human rights community” like Meredith Whittaker, President of Signal app, shared below.
Below are loosely translated excerpts of the interview that I found the most interesting. (All mistakes are mine.)
2mins:17secs
In the late 70s and into the early 80s I had a mentor named Marvin Minsky (co-founder, MIT AI laboratory), for whom I worked as a young researcher, a teenager, and he was the principal author of the way we think about AI these days. A lot of the tropes and stories and concerns come from that time, from Marvin.
2mins:40secs
I always thought ['AI overlords'] was a terrible way to portray the technology. AI is just a way for people to work together; it's just people behind the machine. We're confusing things by pretending there is a genie in the machine, and hiding behind masks. Why are we portraying the technology as a mysterious entity, like an artificial God or something?
16mins:44secs
The answer to deepfakes is provenance: if you know where the data generating an output came from, you no longer worry about deepfakes, because you can ask where it came from. The provenance system has to be robust.
17mins:07secs
Output provenance and data provenance is the only way to combat AI fraud. I actually think that regulators should be involved.
17mins:30secs
The regulation system where AI regulates AI, and AI judges AI, becomes an infinite regress. If instead you say regulation is based on data provenance we have a concrete action. We are not using terms that nobody can define, that nobody knows what they mean. Everybody says AI has to be aligned with human interest but what does that mean?
18mins:46secs
Everybody in AI companies of any scale is saying: actually, we kind of do want to be regulated; this is a place where regulation makes sense. Data and output provenance can play a major role in regulation.
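To make "provenance" a little more concrete, here is a toy sketch of my own (not Lanier's proposal, and not any specific standard such as C2PA): a signed record binds a piece of content to a claim about its origin, and a verifier later checks that neither the content nor the claim was altered. A real system would use public-key signatures and a trusted registry; a shared secret keeps the sketch self-contained.

```python
import hashlib
import hmac
import json

# Demo-only shared secret; a real provenance scheme would sign with a private
# key held by the content producer and verify with the matching public key.
SECRET_KEY = b"demo-only-shared-secret"

def make_provenance_record(content: bytes, origin: str) -> dict:
    """Bind a content hash to an origin claim and sign the pair."""
    payload = {
        "sha256": hashlib.sha256(content).hexdigest(),
        "origin": origin,  # e.g. "synthetic, generated by model X from dataset Y"
    }
    message = json.dumps(payload, sort_keys=True).encode()
    payload["signature"] = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return payload

def verify_provenance(content: bytes, record: dict) -> bool:
    """Check both the content hash and the signature over the origin claim."""
    claimed = {k: v for k, v in record.items() if k != "signature"}
    message = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return (claimed["sha256"] == hashlib.sha256(content).hexdigest()
            and hmac.compare_digest(expected, record["signature"]))

clip = b"...some generated audio or video bytes..."
record = make_provenance_record(clip, origin="synthetic, produced by tool Z")
print(verify_provenance(clip, record))              # True
print(verify_provenance(b"tampered bytes", record))  # False
```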
OpenAI’s board of directors announced that CEO Sam Altman has been fired and will be leaving both the company and the board, effective immediately. Chief Technology Officer Mira Murati has been named interim CEO.
Altman’s ousting reportedly follows an internal “deliberative review process” which found he had not been “consistently candid in his communications with the board, hindering its ability to exercise its responsibilities,” the company announced. As such, “the board no longer has confidence in his ability to continue leading OpenAI.”
Just a few days earlier, Altman had led the DevDay presentation introducing GPT-4 Turbo and OpenAI's new features. The first words I heard were that Microsoft stock had already dropped 1% in the last few minutes before market close.
Gary Marcus tweeted: "Greg Brockman resigns too. Something smells bad. CEO fired, president resigns, same day".
“OpenAI President Greg Brockman, who helped launch the artificial intelligence developer and has been key to developing ChatGPT and other core products, has resigned, according to a person with knowledge of the situation. The move came after the company’s board fired CEO Sam Altman earlier Friday.” in news piece at:
https://www.theinformation.com/articles/openai-president-brockman-resigns-following-ceo-firing?utm_source=ti_app
Sam Altman's take on AGI (and new physics!) at the Cambridge seminar, around 1hr:02mins
(In the context of his just having left OpenAI)
"There are more breakthroughs required in order to get to AGI"
Cambridge Student: "To get to AGI, can we just keep min-maxing language models, or is there another breakthrough that we haven't really found yet to get to AGI?"
Sam Altman: "We need another breakthrough. We can still push on large language models quite a lot, and we will do that. We can take the hill that we're on and keep climbing it, and the peak of that is still pretty far away. But, within reason, I don't think that doing that will (get us to) AGI. If (for example) superintelligence can't discover novel physics, I don't think it's a superintelligence. And teaching it to clone the behavior of humans and human text – I don't think that's going to get there. And so there's this question which has been debated in the field for a long time: what do we have to do in addition to a language model to make a system that can go discover new physics?"
“Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models”
Bad news for watermarking a model’s output so it can later be identified as a product of that model…
https://arxiv.org/abs/2311.04378
AI Explained
Future Tools
Eye on AI