Legal Concerns: Copyright and Artificial Intelligence

Gold-colored scales and a dark wooden gavel sit on a desk, blurred objects in the background.

My post last week, about Google’s large language model using your website content for training data, captured a lot of attention.

The post highlighted the Washington Post story revealing the millions of sites Google’s data set had already scraped.

It was my most visited post in the past four months.

It also shared my frustrations and concerns about copyright.

I’m not alone.


Is Google’s Large Language Model Using Your Website Content As Training Data?

Results of a search of websites in Google's C4 dataset show lireo.com ranks 442,028 with 53k tokens, representing 0.00003% of all tokens.

Remember the post I published in late March 2023, with steps you can take to restrict ChatGPT from using content from your WordPress site?

It may not have worked, if your site’s content was already scraped.

Which it was, in the case of this site, lireo.com.

Not what I expected.


How to Find the Original Source of an Image

My colleague Kelly was working with her client on a new website. And the client found an image they absolutely wanted on their site.

But her client didn’t know where the image came from. Kelly knew she needed to confirm the image could be used on the site.

Where could she start looking? Kelly posted in one of the online forums I belong to: