Legal Concerns: Copyright and Artificial Intelligence

Gold-colored scales and a dark wooden gavel sit on a desk, blurred objects in the background.

My post last week, about Google’s large language model using your website content for training data, captured a lot of attention.

The post highlighted the Washington Post story revealing the millions of sites Google’s data set had already scraped.

It was my most visited post in the past four months.

It also shared my frustrations and concerns about copyright.

I’m not alone.

Photo of the Week: Elegant American Avocet

Large shorebird with a copper head and neck, a white body, and white-striped black wings pauses as it forages in the water of the mudflat.

A striking cinnamon-brown, black, and white shorebird, the American Avocet stood out on the mudflat of the Lake Erie marsh near the Michigan/Ohio state border.

The long, thin, slightly upturned black bill is distinctive, as is the white body with white-striped black wings.

The American Avocet is a rare bird in Michigan and one I had never seen before.

Is Google’s Large Language Model Using Your Website Content As Training Data?

Results of a search of websites in Google's C4 dataset show lireo.com ranks 442,028 with 53k tokens, representing 0.00003% of all tokens.

Remember the post I published in late March 2023, with steps you can take to restrict ChatGPT from using content from your WordPress site?

It may not have worked, if your site’s content was already scraped.

Which is what happened with this site, lireo.com.

Not what I expected.
