The Washington Post analyzed Google’s C4 data set, “a massive snapshot of the contents of 15 million websites that have been used to instruct some high-profile English-language AIs.” The data set includes half a million blogs. It’s not known whether the AIs include ChatGPT.
The Post article includes a search box. So I had to look:
Good grief.
Related reading
All OCA ChatGPT posts
[As the Post notes, “OpenAI does not disclose what datasets it uses to train the models backing its popular chatbot, ChatGPT.” That’s a gift link: free for all to read.]
Thursday, April 20, 2023
The stuff bots are made of
By Michael Leddy at 8:44 AM
comments: 11
Perhaps asking the ‘bot to create a Saturday Stumper is in order?
0 for my blog. I don't know if I should be relieved or insulted :-)
I shudder to think what some of the clues and comments might yield.
Is it possible to be both relieved and insulted? I think that’d be my reaction.
I’m reminded of Groucho Marx’s “I don’t want to belong to any club that would accept me as one of its members” line.
I want to see your blog as a Klassic Komic!
It would have to have many, many panels. (As would yours, too.)
Thanks for the link to this article. I see they scraped my personal website. But I have total trust in Big Tech, and I'm sure they're going to compensate me for using my copyright-protected material in their for-profit business. I heart Big Tech.
It’s pretty galling, isn’t it? And they also make use of sites that absolutely defy copyrights.
Oh my goodness: I checked, and there’s my blog.
How very bizarre.
Once again I am made aware we are living in a sci-fi world.
Thanks for the link.
Well, I tried the tool again and my blog now shows up, but luckily it's at the back of the class in the cloakroom with a ranking of 5,894,945.
It’ll be interesting and creepy to see what it can say about space hoppers and Mr. Moore.