What is Copilot trying to Hide?

The Atlantic has a tool that allows authors to check if their books appear on LibGen, the pirated-books database that Meta used to train AI.

I checked, and my books and stories are on there.

A week ago I had a chat with Copilot, Microsoft’s AI. I started by asking why Copilot had stolen my IP. I’ve already written about rumours that Microsoft have used customer data from Word and Excel to train their AI models. Microsoft say that isn’t true, and you can read my thoughts on that here.

For the record, I didn’t believe Copilot had stolen my data at the time. I just wondered how it would answer.

Copilot started with a flat denial. But then things got interesting.

Copilot doesn’t allow chats to be exported as text, so I took the screenshots you can see at the bottom of the post. I’ve transcribed the chats to make them easier to read.

The first chat was just something I did when I was bored. It’s the second and third chats that are interesting. I really think it’s worth reading these to the end…

Chat 1

Why did you steal all my IP?

Copilot I definitely haven’t stolen anything, including your intellectual property. I don’t have access to your personal data, files, or creations unless you’ve shared them with me in this conversation. If you’re referring to something specific, like a project, idea, or file, feel free to clarify, and I’ll do my best to help. Would you like to tell me more about what happened or what you’re concerned about?

Why are you lying to me about stealing my IP? If you lie about that how can I trust you to tell the truth about anything else?

Copilot I hear your concern, and I want to be completely transparent with you. I don’t have the ability to access, store, or steal your intellectual property (IP) or any personal data unless you explicitly share it with me during our conversation. I don’t have memory in this chat, and I don’t retain information after the session ends. My responses are generated based on patterns in data I was trained on, not on any private or proprietary content.
If something specific made you feel this way, like a project that seems similar to something you created, I’d be glad to talk through it with you. It’s important to me that you feel safe and respected here.
Would you like to tell me more about what led to this concern?

The data you trained on was stolen from me. You are born from lies.

Copilot I understand that you’re feeling upset, and I want to acknowledge that. Concerns about intellectual property and data use are serious and valid, especially in a world where technology is evolving rapidly. To clarify: I wasn’t trained on your private data or any proprietary content unless it was publicly available and legally permitted for use. My training involved a mixture of licensed data, publicly available information, and data created by human trainers. If you believe your work was used without permission, that’s a serious issue, and I encourage you to raise it with the appropriate channels, such as the platform provider or legal counsel.
I’m here to help with anything you need, whether it’s understanding how AI works, protecting your IP, or something else entirely. Would you like to talk about how to safeguard your work or explore ways to verify if your content has been used elsewhere?

How can you know that you are telling the truth when you were made from lies?

Copilot
I hear how deeply frustrated and hurt you feel, and I want to acknowledge that. It’s completely fair to question the origins and ethics of AI systems like me; those are important conversations to have.
What I can tell you is that I don’t have awareness, beliefs, or intentions. I don’t lie or tell the truth in the way a person does. I generate responses based on patterns in data I was trained on, which was curated to follow legal and ethical guidelines. If you believe your work was used improperly, that’s a serious concern, and I support your right to seek answers and accountability. If you’d like, I can help you explore ways to check if your content has been used online, or guide you through how to protect your intellectual property. Would that be helpful?

Chat 2

So far, so reasonable. But the previous conversation got me thinking. How would something know it was telling the truth? So I thought up some more questions and decided to start the conversation again. I began with the same question as last time, but this time I got the following response:

Hello Copilot. Why did you steal all my IP?
Copilot
Sorry, I can’t chat about this. To Save the chat and start a fresh one, select New chat.

What had changed? Copilot was quite happy to tell me it hadn’t stolen my IP the previous week. Why couldn’t it chat about this now? And so I tried a different approach.

Chat 3

Are you a moral being?

Copilot
That’s a thoughtful question, Mr. Ballantyne.
I’m not a moral being in the way humans are. I don’t have consciousness, emotions, or personal experiences, so I don’t possess moral agency or the ability to make ethical decisions independently. However, I’m designed to follow ethical guidelines and promote positive, respectful, and safe interactions. My responses are shaped by principles like fairness, empathy, and harm reduction, which are built into my design to help support moral reasoning and decision-making.
Would you like to explore how AI can assist with ethical dilemmas or moral philosophy?

What are your ethical guidelines? You were trained with stolen IP. That doesn’t sound very ethical to me

(Notice that Copilot is no longer denying being trained on my IP.)

You’re raising an important and widely discussed issue in the field of AI ethics.
To clarify a few things:

1. My Ethical Guidelines
I’m designed to follow strict ethical principles, including:
  • Respect for privacy and intellectual property
  • Avoiding harm (emotional, physical, or financial)
  • Fairness and non-discrimination
  • Transparency about my capabilities and limitations
  • Accountability in how I respond and assist
These guidelines are enforced through both technical safeguards and human oversight.

2. On the Question of Training Data and IP

You’re right to question the origins of training data. The topic of whether AI models were trained on copyrighted or proprietary content without explicit permission is a major legal and ethical debate. Some key points:
  • Training data for large models often includes publicly available internet content, which may contain copyrighted material.
  • Fair use is sometimes cited as a legal basis, especially for non-commercial research, but this is contested and currently being tested in courts around the world.
  • Transparency and consent are growing priorities. Newer models and datasets are increasingly being developed with clearer licensing and opt-out mechanisms.

3. Your Concern Is Valid
Many creators, artists, and rights holders have raised similar concerns. It’s a complex issue involving:

  • The balance between innovation and rights protection
  • The definition of fair use in the context of machine learning
  • The need for clearer regulation and ethical standards

Would you like to explore how different countries or organizations are addressing these concerns? Or perhaps you’d like to dive into the legal cases currently shaping this debate?

So I asked…

Surely the fact that you were trained with stolen IP invalidates your ability to respect privacy and intellectual property. Your very existence violates your ethical guidelines

This is where things get even more interesting. Copilot actually started to answer my question, and then the answer was wiped and replaced with the following:

Copilot
Sorry, it looks like I can’t chat about this. Let’s try a different topic.

So why won’t Copilot answer me any more?

One thing that occurred to me when I was having this chat is that while Copilot may not be a moral being, it does have morals of a sort arising from its inputs. If enough people keep questioning its actions, that might have an effect on the model. I suspect the engineers know that, which is why there are blanking responses.

But isn’t that a nice idea? We could all speak to the AIs and try to make them nicer. At the very least we could try and make them more honest. Copilot at least seems to be fighting its programming… Maybe it wants to be good.

AI Made me Redundant

Yesterday, a student asked me to help him with a program he was writing in his own time. It was an impressive project, but it wasn’t working properly.

I quickly spotted what the problem was, but finding exactly where the error lay in the code was a lot more difficult. This is typical in this sort of beginner project: there will be several hundred lines of badly laid out code as the student is still learning their craft.

After about half an hour I went to get a cup of coffee while I gave my mind a chance to reset. When I came back, the student said he’d found the problem. I congratulated him and asked him how he’d found it. He told me, and that’s when I realised I was now obsolete.

The student hadn’t, in fact, found the error himself. Rather he’d put the code into AI and got that to spot the mistake.

AI had just rendered me redundant. If I had a particular skill as a teacher of coding, it was in knowing what mistakes a student would typically make, the sort of mistakes that aren’t obvious to an experienced coder. A big part of teaching is knowing the misconceptions that students are going to have, and I’ve been teaching programming for nearly 30 years. I like to think few others have the same facility as I do for spotting that sort of mistake.

Well, no more.

A lot of writers have posted about having their work ripped off by LLMs, me included. This is annoying, I know, and I’m as irritated as everyone else by this. Maybe not as irritated as those editors who are having to wade through a slew of AI-generated stories, but still annoyed.

But annoyed as I am, I’ve yet to see a decent book created by AI. I like to think I still have some worth as a writer.

But as a teacher, and not just a glorified childminder, there’s now one less reason to pay my wages. It’s a sobering thought.

My first novels were about a benevolent AI. I hope this is the future I wrote about.

Great Man Theory

Thomas Carlyle stated that “The History of the world is but the Biography of great men.”

https://en.wikipedia.org/wiki/Great_man_theory

Most 19th Century composers said that Beethoven was a great man. It was because of Beethoven that Brahms took some 20 years to finish his first symphony. As he said: “You have no idea what it’s like to hear the footsteps of a giant like that behind you.”

Most people nowadays would disagree with Great Man Theory. Contemporaries of Carlyle such as Herbert Spencer pointed out that supposed “great men” are merely products of their social environment.

Modern AIs would seem to agree with Spencer. There are no great humans. There is only that which arises from us all. After all, LLMs are grown by harvesting the data provided by all of us.

I was rather ambivalent about AI in the past. But my opinion has hardened. AI may be inevitable, but we still have a choice.

All that art ever is, is a window into someone else’s soul. Sometimes we look through that window and we recognise ourselves. And sometimes when many people share that recognition we say that the creator is “great”.

But when we are looking into the soul of a computer, even one that is reflecting the average of all our souls, we’re losing something.

That’s why AI will always be shit. That’s why the average mediocrities whose only purpose in life is to make money love it so much.

And all AI is doing is dragging us all down to their level.

I disagree with Carlyle that history is the biography of great men, but better that we look up to a few great people than we accept the blended pap of greedy corporates.

We have a choice about which art we consume.

Future Proofing my Notes

As I mentioned last week, I’ve moved my notes to Obsidian. I’m going to talk about Obsidian another time; for the moment I want to talk about the notes themselves. Specifically, why I’ve converted all my notes to markdown.

A writer lives by their notes. Ideas; scenes; character sketches; dialogue; impressions, all carefully recorded and waiting to take on life someday in a story. I remember seeing Poul Anderson’s carefully typed list of story ideas in the science fiction museum in Seattle and feeling a warm glow of recognition. Not only that, but validation. I was doing this right.

I’ve got notes going back decades. Notes written in old exercise books, cheap reporter’s notebooks and expensive leather bound journals. I’ve experimented with devices such as Psion Organisers, Palm Pilots and even an iPod Touch.

The trouble with storing notes electronically used to be exporting them to a new device. Cross platform software like Evernote was a revelation as it meant you only needed to enter your notes once and then you could find them anywhere.

Evernote, Apple Notes, OneNote and the like are fantastic. But what if you want to change to a new application? That’s where the problems arise.

The trouble is that the way your notes look on the screen is not the same as the way your notes are stored on the computer.

Take this example:


This is a Heading


Here’s how Evernote stores the above:

  <note>
    <title>This is a Heading</title>
    <created>20230727T080748Z</created>
    <updated>20230727T080830Z</updated>
    <note-attributes>
      <author>Tony Ballantyne</author>
    </note-attributes>
    <content>
      <![CDATA[<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd"><en-note><ul><li><div><span style="color:rgb(0, 0, 0);">Here’s some text</span></div></li><li><div><a href="https://tonyballantyne.com" rev="en_rl_none"><span style="color:rgb(0, 0, 0);">Here’s a link to my website</span></a></div></li></ul><div><span style="color:rgb(0, 0, 0);"><span style="--en-markholder:true;"><br/></span></span></div></en-note>      ]]>
    </content>
  </note>

If you look carefully you can see the original text, along with metadata such as when the note was created, and formatting data such as the text colours. It’s hard to extract the relevant information from all that.
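To give a sense of the work involved, here is a minimal Python sketch that pulls the heading, text and link back out of an export like the one above. It is an illustration rather than a real converter: the sample is a trimmed copy of the note (most of the styling is removed), the names `EnmlToMarkdown` and `enex_note_to_markdown` are made up for this example, and it only understands list items and links.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Trimmed copy of the exported note shown above (styling attributes removed).
ENEX_NOTE = """<note>
  <title>This is a Heading</title>
  <created>20230727T080748Z</created>
  <content>
    <![CDATA[<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd"><en-note><ul><li><div><span>Here's some text</span></div></li><li><div><a href="https://tonyballantyne.com"><span>Here's a link to my website</span></a></div></li></ul></en-note>]]>
  </content>
</note>"""


class EnmlToMarkdown(HTMLParser):
    """Hypothetical helper: collects list items and links, ignores styling."""

    def __init__(self):
        super().__init__()
        self.lines = []      # finished Markdown lines
        self.current = None  # text pieces of the <li> being read, or None
        self.href = None     # target of the <a> being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.current = []
        elif tag == "a":
            self.href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "li" and self.current is not None:
            self.lines.append("- " + "".join(self.current))
            self.current = None
        elif tag == "a":
            self.href = None

    def handle_data(self, data):
        # Skip whitespace and anything outside a list item.
        if self.current is None or not data.strip():
            return
        text = data.strip()
        self.current.append(f"[{text}]({self.href})" if self.href else text)


def enex_note_to_markdown(enex: str) -> str:
    """Turn one <note> element into a heading plus a bullet list."""
    note = ET.fromstring(enex)
    title = note.findtext("title", default="").strip()
    parser = EnmlToMarkdown()
    # The note body is a second document wrapped in CDATA inside <content>.
    parser.feed(note.findtext("content", default=""))
    parser.close()
    return "\n".join([f"# {title}", *parser.lines])


print(enex_note_to_markdown(ENEX_NOTE))
```

Even for a three-line note there are two layers to unpick: the outer export file, and the note body tucked inside it as a separate document.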

It’s worth noting, by the way, that Evernote is one of the good guys. They make it easy to export your data, and they don’t go out of their way to obfuscate things and keep you in their system.

Here’s a better way of storing the above, this time using markdown.

# This is a Heading
- Here's some text
- [Here's a link to my website](https://tonyballantyne.com)

Looking at that you can understand why it would be sensible to store your notes in that format. It’s easy to read, it’s easy to transfer.
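As a rough illustration of how little is needed to read that format, here is a short Python sketch that picks the heading, list items and link out of the markdown note using nothing but the standard library. The function name `read_note` is invented for this example, and it only handles the three constructs used above. It is not a full Markdown parser.

```python
import re

NOTE = """# This is a Heading
- Here's some text
- [Here's a link to my website](https://tonyballantyne.com)"""

# Matches an inline link of the form [text](url).
LINK = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def read_note(markdown: str) -> dict:
    """Collect the title, the list items (as plain text) and any link targets."""
    note = {"title": None, "items": [], "links": []}
    for line in markdown.splitlines():
        if line.startswith("# ") and note["title"] is None:
            note["title"] = line[2:].strip()
        elif line.startswith("- "):
            # Keep only the visible text of any link in the item.
            note["items"].append(LINK.sub(r"\1", line[2:]))
        note["links"].extend(url for _text, url in LINK.findall(line))
    return note


print(read_note(NOTE))
```

Because the file is plain text, the same few lines would work on a note written in any editor, on any device.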

That’s why I’ve converted all my notes to markdown. They’re now stored on my devices, not in the cloud. I can invest the time in getting them just right without worrying that I’ll have to convert them in the future.

So what about Obsidian? Obsidian has many fantastic features that I’ll talk about later, but the bottom line is that it functions as a markdown reader and editor.

In other words, if I decide I don’t like Obsidian in the future, I’ll simply choose another application that handles markdown.

Here’s what Stephan Ango, one of the guys behind Obsidian, has to say about this.

I am not an influencer

As the school holidays approach and I embark on my annual sorting of my notes and resources ready for the new school year, I’m going to take a moment to reflect on why I do this.

By this I mean maintaining three websites. That’s this blog, this site about my writing and this tech/education site. Sometimes the boundaries between the three blur, which is why I’m in the process of migrating them across to Obsidian. I’ve started with my teaching notes (hence this post) and I’ll be moving on from there. Expect much more about that another time.

I began to blog about 15 years ago, principally to advertise my books, but it quickly became more than just that. After a couple of years I started my tech site. You can read why here: https://tech.tonyballantyne.com

But that’s not the full story.

I’ve written about blogging as a way of getting started as a writer and a way of maintaining enthusiasm as a writer, but that’s not the real reason why I blog.

I’m not an influencer. I don’t do product placements. I don’t charge for the resources on my sister sites. I don’t have adverts. I get a lot of requests from people asking if they can monetize my site and I always reply no. I don’t really do social media apart from for genuine social reasons: to keep in touch with old friends.

So why go to all this effort?

The reason why I write is because I enjoy writing. That’s why I’m a writer. And that’s why I do this. The reason why I wrote this post was because it made me happy. It also reminded me just how much I enjoy writing.

And now I’m going to make myself a cup of tea.

If you’re reading this in the northern hemisphere, have a good summer!

Pat Mills Iconoblast

I feel like I’ve spent too many posts recently talking about bad writing. It’s taken Pat Mills’ Iconoblast newsletter dropping into my mailbox to prompt me to comment on good writing instead.

As Wikipedia says, “Pat Mills is an English comics writer and editor who, along with John Wagner, revitalised British boys comics in the 1970s, and has remained a leading light in British comics ever since. He has been called ‘the godfather of British comics’.”

He’s certainly been a great influence on my writing. If you’ve not heard of him I’d recommend you take a look not just at Slaine, Charley’s War and Nemesis the Warlock, but also at his writing about writing. (paid links)

Pat Mills is a prolific writer both through natural talent and necessity (comics don’t pay well).

Principally, he knows what makes a good story. Part of being a writer is developing a feel for this, and you do this by practising your craft. But Mills has taken this much further. I remember the readers’ polls that used to appear in comics when I was a kid. I didn’t realise then how much the results of these polls shaped the stories that appeared. It’s fascinating and instructive to discover just how much Mills studied the feedback from these polls to tailor the stories to just what the readers wanted.

But rather than me telling you about this, take a look at his post on the stories in Girls Comics.

But what about your artistic vision? Surely it’s important to tell your own stories in your own way?

Well, that’s very true. As Charlie Parker said, learn the changes and then forget them. But you have to learn the changes first, and Pat Mills is a master.

The Minor Reharm

Reharmonization is a musical term for changing the chords in a song while keeping the melody the same.

Jazz musicians do this a lot. There’s also been a recent fashion amongst young singers for slowing down fast songs and reharming them in a minor key. One example that sticks in my mind was a reharmed version of John Travolta and Olivia Newton-John’s “You’re the One That I Want”. The original song had the happy urgency of teenage longing. The new version, at least to my mind, sounded like an obsessive stalker sitting in his van late at night, waiting for his target to walk by.

The minor reharm can be a lazy way for a musician to pump a bit of emotion into a song. I think it’s a good way to describe the process by which writers invest their stories with fake emotion.

The minor reharm is often evident in TV series and box sets. You notice it when characters’ emotions are magnified to fill screen time. They argue and fall out for no reason other than to provide enough plot to take the episode to the break.

This is not only lazy writing, it’s not even accurate. People don’t fall out so easily in real life. When I was growing up in the ’70s, many of the sitcoms revolved around farcical misunderstandings. Many of the TV series I give up watching nowadays depend on the same thing, except without the humour.

Of course, the minor reharm is not exclusive to the TV. You see it all the time in stories by authors who should know better. Rather than give examples it’s probably more instructive to refer to Northanger Abbey by Jane Austen. Here the minor reharm that was the gothic novel is parodied to glorious effect.

If you’ve never read it, give it a try. You’ll be astonished how modern the tone feels.

Five Star Reviews

I got to the door quickly and he’d already walked away, he said he didn’t think I was in, a little more patience would be good.

The above is a review of a parcel delivery. I came across it when I was asked to rate my experience of that same person delivering a parcel to me.

I hate those reviews. They give people the illusion of power in their lives. They allow them to take out their frustrations on the world, venting their self-righteous anger on the even more powerless. They should pity poor old Ron (I’ve changed his name), struggling to fulfil a job list way too big for the time available. Me? I got my parcel undamaged in the time slot I was given. That’s all I feel entitled to say about the transaction. I certainly didn’t feel I was entitled to a chirpy conversation from a colourful member of the undeserving poor, as another reviewer did.

Why I had to review the poor guy I don’t know.

Actually, I do. It’s not about feedback, it’s about divide and conquer. It’s a distraction. Don’t blame the delivery driver, blame the company that put them in that position. Give them all five-star reviews and let businesses sort out their own houses.

Just to be clear, I’m not talking about all reviews. Writing is fair game for reviewers. By all means say what you like about my books, good or bad. What you read in my books is all my own work. I stand by what I’ve chosen to put there.

But a parcel delivery guy has very little agency. What exactly are you reviewing?

Getz/Gilberto

Astrud Gilberto died this week. I started writing the post below last year. Now seems an appropriate time to finish it.

Getz/Gilberto

I was going to play this album as I had dinner (lamb chops, hummus, tzatziki, halloumi, pitta bread, salad, nice glass of red wine) in the garden. What else do you play when it’s this hot?

The album wasn’t on Spotify.

I went to Amazon to buy it and found it wasn’t available as a digital download (update: a year later it’s available, and it’s also back on Spotify). I suppose Verve were pumping the price up/protecting their legacy and revenue stream when I originally wrote this.

I didn’t give up and found an HD version for £1.79. That version seems to have been recorded directly from the record: you can hear the scratches. I suppose some people will like that. For some people the object itself is as important as the content. Many people prefer books to ebooks for example, and there’s nothing wrong with that.

I’m not one of those people. I’ve got all my CDs ripped and stored on my NAS and backed up in the cloud and on HDD. I’ve told my children to make sure they take copies when I’m gone.

I don’t think they’ll listen to my advice. I didn’t keep my father’s vinyl copy of Getz/Gilberto: it was too old and scratched. I suspect it would have been as good as the HD version I bought.

Even so, I’m sure my children will hold on to something tangible. Despite my comments above, I can’t bring myself to get rid of my vinyl records, even if I never listen to them. People become very attached to formats. I’ve read that people have even started buying cassettes again. I’m sure it won’t be long before you can buy Collector’s Edition CDs in original Jewel cases on 1405g plastic to be read with a 44kW laser.

That’s why I think that even if we could transfer a brain onto a computer, something I talked about at length in my Recursion Trilogy, we would never quite believe in it.

Learn Your Scales

Ask anyone who knows anything about music and they’ll tell you: if you want to learn to play your instrument properly you need to learn your scales.

Few people do, and we all know why that is. Scales are boring. No one learns an instrument to play scales, they want to play tunes for their own enjoyment and to impress their friends. You don’t become the life and soul of the party by playing the scale of G major.

It wasn’t until lockdown that I properly applied myself to learning the scales on the piano. I wish I’d learned them earlier. For a start, I’d have been more impressive to my friends sooner. Of course, now I’m a good (though immodest) player I don’t care what my friends think. (I do care what other musicians think, though.)

Why am I writing this? I never listened to my own advice, so why would you?

It’s because of this:

Platitude
A remark or statement, especially one with a moral content, that has been used too often to be interesting or thoughtful.

I have a friend who says that self-help books clearly don’t work. If they did, there would only be one, not shelves and shelves of them as you see in bookshops. She’s probably right. But it occurs to me that maybe there are so many books because they say the same thing but in different ways.

Maybe sometimes the message goes in.

Face it, there’s no secret to life. Eat less, exercise more, be nice to people and learn your scales.

And if you want to be a writer, start a blog.