
Let the AI test its own code (3 Sept 2025) (Gone Wrong!? I created an infinite loop!)

A colleague of mine recently told me he likes to have automated testing in the projects where he uses an AI coding “companion”. If I recall correctly, he prefers using Cursor and Claude, but that’s beside the point.
The idea of automated testing and having the AI improve on its own was fascinating to me. Today I had a go at it, since two days ago I created a Docker container in which Codex could run without any guardrails.
My first proper attempt at telling it to run tests and improve code based on the tests resulted in this:
You now have full access, can you run the tests and self-improve?
The result was a prompt that ran for 3600+ seconds. That’s 60 minutes, also known as an hour.
 
I’m not even quite certain what it’s doing. It seems to have created some more tests, and improved some tests and the code that is being tested, but not by any crazy amounts. It also takes a very, very long time to complete “npm run -s coverage”. When I run it locally myself it takes ~90 seconds.
 
In the 60 minutes that it ran, it managed to complete npm run coverage a full 8 times, an average of about 7.5 minutes per cycle. In theory, that means it should have run 8 cycles of improvement. I do wonder how long this could go on; for the sake of an experiment, I should maybe try this sometime.
 
But for now I’m ending this loop, as it’s hogging my assistant.
 
I found that, for some reason, it had created a backup of my node_modules folder, twice: one in my project root and one in the coverage folder. This introduced tens of thousands of files that now had to be scanned for the test coverage as well, which made it slow, to say the least.
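To spot stray copies like these before starting a run, a small script can list every folder whose name contains node_modules and count the files inside it. This is only a minimal sketch: the function name find_dependency_copies is made up for illustration, and it assumes the backups kept "node_modules" somewhere in their folder name, which I can't confirm for every case.

```python
import os
from pathlib import Path


def find_dependency_copies(root, marker="node_modules"):
    """Return {relative_dir: file_count} for each directory whose name contains marker.

    Assumption: backup copies keep the marker text in their folder name.
    """
    root = Path(root)
    found = {}
    for current, dirnames, _ in os.walk(root):
        matches = [d for d in dirnames if marker in d]
        for name in matches:
            path = Path(current) / name
            # Count every file below the matched directory.
            count = sum(len(files) for _, _, files in os.walk(path))
            found[path.relative_to(root).as_posix()] = count
        # Do not descend into matched directories; nested packages would only add noise.
        dirnames[:] = [d for d in dirnames if marker not in d]
    return found
```

Running find_dependency_copies(".") from the project root should report the regular node_modules folder plus any copies. Anything beyond that single entry, especially inside the coverage folder, is a candidate for cleanup.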
 
Time for another attempt at this, now with a normal amount of files.
I gave it a similar prompt:
Please run cleanup once. Then run the tests and self improve
 
It gave a normal, one-time result. Sure.
Another attempt, trying to get it into a loop again.
I would like you to improve code coverage, run tests, improve based on the tests, run tests, fix anything that needs to be fixed, improve code coverage and repeat this cycle again
Again, this gave a normal output.
 
I guess my one-time massive loop was a special occasion. Perhaps there are other ways to get it into one, and I could probably tell it to do n loops. The idea is quite fun if you have some time left over and just want to give it a prompt and walk away for an hour to do some chores.
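One way to get a fixed number of loops without relying on the prompt is to drive the cycle from outside the assistant. The sketch below is not something I ran in this experiment. The names run_cycles and run_command are hypothetical, and the improve command is whatever non-interactive invocation your assistant's CLI offers, which I am leaving as a placeholder rather than guessing.

```python
import subprocess


def run_command(command):
    """Run a shell command and return True when it exits with status 0."""
    return subprocess.run(command, shell=True).returncode == 0


def run_cycles(n, test_command, improve_command, run=run_command):
    """Run exactly n test/improve cycles and return a log of (cycle, tests_passed)."""
    log = []
    for cycle in range(1, n + 1):
        passed = run(test_command)
        log.append((cycle, passed))
        # Ask the assistant for one round of improvements, whatever the test outcome.
        run(improve_command)
    return log
```

Calling run_cycles(8, "npm run -s coverage", improve_command) would mirror the 8 cycles from the runaway session, with improve_command set to your own assistant call. Because the loop count lives in the script, it stops after n rounds no matter how the assistant interprets the prompt.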
 
Sam van Noord