Following an earlier, mostly successful experiment in vibe coding a simple JavaScript game, I decided to take it to the next level: see if I could use ChatGPT to assist me in discovering a new and improved algorithm for reversing JavaScript’s Math.random() function. Specifically, the research produced a new way to invert Xorshift128+, and a way to translate that into a full inversion of Math.random(). More optimisations are still needed to make it practical (work in progress), but it is a good start.
Although this was a new result, I found that my use of ChatGPT in this case decreased my productivity. Despite that, I will continue to use it for research purposes because I now have a much better idea of how to use it. The purpose of this blog is to pass on these insights to others so they can similarly get the most out of such tools.
More specifically, this blog is about the value I see today in using ChatGPT to assist in research and invention. I used the free version, which is ChatGPT-4o. The unique contribution of this blog is a list of DOs and DON’Ts for using ChatGPT to assist in research. I need to be clear that the scope of my investigation is only ChatGPT: I make no promises about other AI tools. The full details and justification are in the body of the blog, but the summary is as follows:
- DO use ChatGPT to assist you in your research, but keep it on a tight leash (i.e. don’t let it do too much at once)
- DO use ChatGPT to validate your ideas
- DO use ChatGPT to get feedback on better ways to communicate your ideas
- DO use ChatGPT to proofread your writeup
- DON’T let ChatGPT take the lead on your research — it will err too many times
- DON’T invest too much time into exploring ChatGPT’s ideas because many will not work (and yes, it does come up with clever ideas that look plausible)
- DON’T let ChatGPT assist you in debugging code (or at least restrict it): it will lead to rabbit holes and more problems
- DON’T let ChatGPT write up your research for you — while it does an excellent job of communication, people generally dislike and distrust content that someone else got directly from a bot
Related work: I am obviously not the first researcher to explore AI as a research aid; for example, see these links from renowned mathematical researchers: link 1, link 2, link 3, link 4. A common theme in these links is likening it to a junior research assistant. I agree with that.
This thing can really “think” and innovate like a human researcher
Hold off on calling me an idiot and actually look at what came out of ChatGPT. You really have to see it to believe it.
The full chat history is here (link), but it is long and will take a minute to load in your browser. There are a lot of examples in there of it trying to invent solutions. I am just going to show you one of its ideas that I thought was worth trying but eventually abandoned (the reasons are explained in the next section).
There are two prompts that I gave it that hinted where I wanted to go, but it ran with a solution that I never imagined on my own. Here is the first prompt:

The second prompt starts with my “Let’s think this through” prompt. Screenshot below:

It ran wild with its idea to solve the problem by bringing in carry bits as unknowns into a set of linear equations and deriving solutions to the unknowns, including the carry bits, by induction and linear algebra. I emphasise that I never would have brought carry bits as unknowns into the linear equations; that was its idea. Here you can see ChatGPT’s attempted solution to this problem:

I couldn’t fit it all into one screenshot, so here is the next part as a second screenshot:

I looked at this with amazement, but also a bit of uncertainty. Is this really going to work? It sounded like it was worth considering. So I asked myself, what is the best way to verify it? I could verify the logic, but it was also sketching an implementation and offering to give me one. As a first step in deciding whether it was right, running an implementation was a lot quicker for me than checking the logic, so I took the easiest path. If that worked, the next step would be to check the logic. I asked it for a C implementation to test its idea and right away it produced one. Wow, cracking Xorshift128+ is going to be a lot easier than I expected!
How wrong I was! Instead, this is where the rabbit holes began and a lot of productivity was lost. But whether its ideas work or not is separate from the statement that it can come up with plausible ideas on its own, which it did. Sure, I pointed it in one direction, but the solution that it attempted was entirely its own.
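For readers who have not met the carry-bit issue before, here is a minimal sketch of why carries come up at all. It is not ChatGPT's solution and not the algorithm from this research. It only illustrates two facts: a state update built purely from shifts and XORs is linear over GF(2), while 64-bit addition becomes "XOR plus a carry word", so each carry bit is an extra unknown. The function names and the shift constants (23, 17, 26) are assumptions chosen for illustration.

```python
import random

MASK64 = (1 << 64) - 1

def xorshift_state_update(s0, s1, a=23, b=17, c=26):
    """One xorshift-style state update using only shifts and XORs.

    The shift constants are assumed for illustration only.
    """
    x = s0
    x ^= (x << a) & MASK64
    x ^= x >> b
    x ^= s1 ^ (s1 >> c)
    return s1, x  # new (s0, s1)

def add_with_carries(a, b, width=64):
    """Return (sum mod 2**width, carries).

    Bit i of `carries` is the carry coming INTO bit position i.
    """
    carries = 0
    carry = 0
    for i in range(width):
        carries |= carry << i
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        # The carry out of bit i is the majority of the three input bits.
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    total = (a ^ b ^ carries) & ((1 << width) - 1)
    return total, carries

if __name__ == "__main__":
    rng = random.Random(1)
    p0, p1, q0, q1 = (rng.getrandbits(64) for _ in range(4))

    # Linear: updating the XOR of two states equals the XOR of the updates.
    u = xorshift_state_update(p0 ^ q0, p1 ^ q1)
    v = xorshift_state_update(p0, p1)
    w = xorshift_state_update(q0, q1)
    print(u == (v[0] ^ w[0], v[1] ^ w[1]))  # True

    # Addition is XOR plus the carry word; without the carries it is not linear.
    total, carries = add_with_carries(p0, p1)
    print(total == (p0 + p1) & MASK64)  # True
```

If the carry word were known, the sum would again be a plain XOR of known and unknown bits, which is what makes treating carries as unknowns look attractive on paper.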
Here’s a second example, where ChatGPT convinced me to add a pruning function. That idea also turned out to be unsuccessful, as you will see later in the blog, but it was again its idea. This example involved a few discussions back and forth, including me pasting in my code and telling it that I was not convinced that it would add value. Its response changed my opinion.

Still think I’m an idiot for saying it can think like a human? Maybe I am, but I am in good company.
Lesson learned: Keep the bot on a tight leash, and other stuff you should not let it do
As I said above, the easiest path forward for me was to just have it produce the program to invert Xorshift128+, so I asked it to. But the program it produced was buggy. Of course the bot was very eager to help me debug it. This is where one problem led to the next.
One of the early problems was that it produced an implementation that was highly optimised, packing bits into words so it could do multiple parallel bit operations at a time. This is not a good approach when you are testing an idea: the first thing you want to do is write code that is obvious to understand and trivial to debug. Optimisation is a later step that you do once the basic code is working. This is something one learns from years of experience in scientific programming. I told it this in the chat log: “I must emphasize that when we are prototyping an idea, we care more that it is so obviously correct than about how fast it is. Once we know it works, it is trivial to make it faster later.”
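To illustrate the "obvious first, fast later" point, here is a small sketch that is not taken from the chat log or from my actual code. It shows one GF(2) row operation written two ways: one list element per bit, which is easy to print and inspect, and a packed version where a single integer XOR handles every bit at once. All function names are hypothetical, and a Python integer stands in for the machine words a C implementation would use.

```python
def xor_rows_simple(row_a, row_b):
    """XOR two rows stored as lists of 0/1, one element per bit."""
    return [x ^ y for x, y in zip(row_a, row_b)]

def pack_bits(row):
    """Pack a list of 0/1 values into an integer; bit i holds row[i]."""
    word = 0
    for i, bit in enumerate(row):
        word |= bit << i
    return word

def unpack_bits(word, n):
    """Inverse of pack_bits for a row of length n."""
    return [(word >> i) & 1 for i in range(n)]

def xor_rows_packed(word_a, word_b):
    """XOR two packed rows: one operation covers all bits in parallel."""
    return word_a ^ word_b

def packed_matches_simple(row_a, row_b):
    """Use the simple version as the reference for the packed one."""
    expected = xor_rows_simple(row_a, row_b)
    actual = unpack_bits(
        xor_rows_packed(pack_bits(row_a), pack_bits(row_b)), len(row_a)
    )
    return actual == expected

if __name__ == "__main__":
    a = [1, 0, 1, 1, 0, 0, 1, 0]
    b = [0, 0, 1, 0, 1, 0, 1, 1]
    print(xor_rows_simple(a, b))        # [1, 0, 0, 1, 1, 0, 0, 1]
    print(packed_matches_simple(a, b))  # True
```

The simple version is the one you can trust by reading it. Once it works, it becomes the reference that any optimised version has to match.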
But it got a lot worse during the debugging session as it kept trying to change the code and even the underlying data structures of what we were trying to debug. I had to scold it several times that this is not how you debug software. I eventually laid down my ultimatum:

On top of that, there were the hallucinations. It started imagining code that never existed. Lesson learned: be very cautious about letting it help you debug code.
All up, the problem was letting the bot do too much at once. While I was thoroughly impressed by the tool inventing solutions and attempting to program them, it is just not mature enough to do that much at once. In the future I can imagine these bots using other tools to cross-check themselves, and I can definitely believe in a future where they replace human researchers. But it’s not there yet.
The main lesson learned is to keep the bot on a short leash. Instead of letting it try to do much of the work for you, take it one piece at a time and check each piece as you go. If it’s going to offer ideas then validate them before you pursue them. Ultimately it is important for you to remember that you are the boss and the leader, and the tool is your junior assistant. At least for now, no bets for the future!
As it is the junior assistant, the other lesson learned is not to invest too much time in its ideas. In my experience, most of them failed. There was one case where it really seemed to have a good idea for pruning the search space, which I looked into because I really thought it would help. I later found that it did not help: the way I constructed the search meant that none of the pruning cases could ever happen, so checking for them in the hope of an early abort was actually slowing things down. When I discovered this and explained it to ChatGPT, it acknowledged it:

Whereas the previous section was about what it didn’t do well, this one is about what it did well.
There were so many times where I had an idea and I wanted to run it by someone, or maybe I should say some thing. The bot was definitely my friend here. It did an excellent job at understanding my concepts and writing them back to me better than I had communicated them.
Here is one example where I told ChatGPT my idea for the algorithm that eventually led to the main research result.

It’s always good to be sure that your ideas make sense. I was happy for it to confirm my ideas:

I must emphasise that the tool will not always agree with you just for the sake of agreeing. It will tell you when there are logical failures. There was one case where it was wrong in claiming that I made a logical failure, but it led to me describing it better and it eventually agreeing. Even in that case I found value in the discussion, which helped me explain it better.
Lesson learned: Use the tool to get feedback on better ways to communicate your ideas
Generally, ChatGPT would always summarise my ideas in a nicer way than I originally communicated them, and that helped me think more clearly.
But there’s more to it than that. For example, when I was looking at the original Xorshift128+ implementation, I was bothered by the variable naming and found it very confusing to use in my research. I asked the bot if it could suggest a renaming of variables that made it more friendly to humans. It did so:

We had a few more iterations before we settled on some naming of variables going forward. This helped me a lot.
Lesson learned: Do ask ChatGPT to proofread your writeup (but don’t let it re-write it for you)
If you read my previous blog on Xorshift128+, you would see that I stated that the original notation was confusing and I rewrote the whole blog to use an easier-to-read notation:

This made me nervous because I had spent a large amount of time writing it up originally and I was scared to make changes and get it wrong. Thankfully, ChatGPT checked my work.
This was actually really impressive. I had to cut-and-paste formatted text into ChatGPT where the formatting was lost in the paste. For example, 2²⁶ would show up as 226. I told the bot that this might happen and to keep in mind that there could be formatting errors, and it mostly understood that.
As I was getting it to review my writeup, it offered to write several parts better. There is no doubt about it, ChatGPT communicated the ideas better than me. I acknowledged to the bot that it is a better writer than me but I want to keep it in my voice. I think many people get this, but for those who don’t, nobody wants to read someone else’s AI generated content. Keep it in your voice.
One of the great catches from the bot was noticing that I used the word “not” where I intended to write “now” (see screenshot below). This little catch made me very happy that I was asking it for help.

AI cynics are everywhere. I used to be among them, but this field is changing rapidly and I expect many people will be surprised to see how powerful the tools are becoming and what they can do. Obviously, if I hold the blade end of a chainsaw to try to cut down a tree, I’m not going to end up with a positive result. Like any tool, one needs to know how to use it properly. Over time, such tools will become more helpful as the technology improves. In the meantime, I think researchers are doing themselves a disservice if they do not bring AI assistance into their tool belt.