Friday, February 19, 2021

Ultimate ADOM: Loading & Saving (Technical Clarification)

Load & Save is coming soon. If we hadn’t stumbled upon a serious internal bug, it would have been part of the EA release. We are working on it and expect to provide it ASAP.

 

So what happened? Why are we going to EA without such an important feature, knowing full well people usually don't like leaving their computers running for hours?

 

It's quite simple: It's entirely our fault. We were overly optimistic about our ability to get Saving and Loading into the game in a way that works, instead of most often placing you into a mirror universe of the one you left when saving, where things look roughly the same but aren't, and where bugs will murder, corrupt and crash you, and not in a good way. In short: The solution we thought would work didn't quite, and we feel that the frustration of trying to load a game that then crashes around you would have been much higher than that of not being able to save and load at all.

 

Ultimate ADOM: Caverns of Chaos is still an Early Access game, and as such there are some core features missing. We have recently published a roadmap in which we outline when the next big features such as Hunger and Corruption (in April), better combat options and AI (in June) or enemy spellcasters (in August) will make it into the game.

 

Your feedback, both positive and negative, is very valuable to us and very welcome.

 

In the next few days, we are focusing nearly exclusively on getting Saving and Loading into the game in a way that works for everyone. Thank you for your patience!

  

Here's a technical explanation, for those of you in the know or simply interested, of what went wrong with our initial estimate.

 


Data in Ultimate ADOM is insanely complex.


This is now going to get a bit technical, but I would like to explain what happened so that everything is clear and in the open: While we have 20+ years of experience doing Java development and many years of programming ADOM in C, the C# and Unity used in Ultimate ADOM bring their own challenges. Sometimes these are unexpected. The dangerous thing we noticed early on is that – while C# and Java look very similar on the surface – the actual ecosystems are vastly different, especially in the design approach to the programming languages and the open source ecosystem surrounding these two worlds.

 

My personal opinion is that both languages are fantastic to work with (and these days I even prefer C# to Java although Java has been my big love for more than 20 years [I started with Java in 1998]). But strangely both languages have areas where the design and engineering behind the language are miles ahead of the competitor and you really wonder “how can this happen with all these brilliant people working on language design”. And Java has the far better open source ecosystem with far more advanced (and tried and tested) solutions to complex issues.

 

But back to our game:


Ultimate ADOM has a very complex data structure because we really try to simulate a highly detailed fantasy environment. Our goal is to implement a game engine that can scale up the complexity to insane levels, all in the name of fun. Our target is more the micro level of the game (e.g. the individual character, the items carried, the murals on walls, the slippery fluid on the ground, the DNA of your hand) because we want to have the crazy features often mentioned in the run-up to the game release, like grafting (e.g. attaching a dragon head to your body, giving you the ability to bite and breathe fire), animancy (e.g. animating your trusted sword and turning it into a companion that follows you through the dungeon… or doing the same with an altar or a dungeon wall) and complex elemental magic and related effects (weapons becoming damaged from hacking at walls, an extensive liquid system in order to have pools of healing liquid or rivers of confusion liquid). And so on and so forth. To be able to do this we built a system based on a technology called ECS (entity-component-system) and used that for everything in the game. For absolutely everything.

 

So e.g. the player character is an entity that in turn consists of a skeleton (an entity) with more than 15 individual body parts (again entities), has a race and profession (all entities) and lots of equipment (all entities, some of that attached to body parts, others in your backpack, again all entities). Items among other things are made from certain materials (defined as entities) and weapons/spells/etc. cause damage (again defined as entities). If you do a text dump (in an internal JSON-like format) of just a freshly created player character, that text dump will have more than 700,000 (!) lines of text. Yes, there is a lot of redundancy in that, but the nested complexity of the entities and components used to model the game world is quite astonishing (even to us), especially as performance is pretty good with this, although we have yet to do a lot of performance tuning during EA.
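To make the "everything is an entity" idea a bit more concrete, here is a minimal sketch in Python. It is not the game's actual C# code and it is far simpler than a real ECS: the `Entity` class, the helper functions and all concrete values ("human", "fighter", "iron", the four body parts) are assumptions made up purely for illustration.

```python
import json
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # hypothetical id generator for this sketch


@dataclass
class Entity:
    """Toy entity: a kind, some component data, and nested child entities."""
    kind: str
    components: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    id: int = field(default_factory=lambda: next(_ids))

    def add(self, child):
        self.children.append(child)
        return child


def build_player():
    """Build a tiny player: race, profession, skeleton, body parts, one item."""
    player = Entity("player")
    player.add(Entity("race", {"name": "human"}))
    player.add(Entity("profession", {"name": "fighter"}))
    skeleton = player.add(Entity("skeleton"))
    for part in ("head", "torso", "left hand", "right hand"):
        skeleton.add(Entity("body part", {"name": part}))
    sword = Entity("item", {"name": "sword"})
    sword.add(Entity("material", {"name": "iron"}))
    sword.add(Entity("damage", {"type": "slashing"}))
    skeleton.children[3].add(sword)  # the sword is attached to the right hand
    return player


def count_entities(entity):
    """Count this entity plus everything nested below it."""
    return 1 + sum(count_entities(c) for c in entity.children)


def dump(entity):
    """Turn the entity tree into plain dicts/lists for a JSON text dump."""
    return {
        "id": entity.id,
        "kind": entity.kind,
        "components": entity.components,
        "children": [dump(c) for c in entity.children],
    }


player = build_player()
text = json.dumps(dump(player), indent=2)
print(count_entities(player), "entities,", len(text.splitlines()), "lines")
```

Even this toy character with a single item already consists of 11 entities, and the text dump has several lines per entity. Scale that up to a full skeleton, real equipment and materials, and it becomes easier to see how a dump can grow so large.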

 

Is this complexity really necessary? I have pondered this question long and often and still will answer it with a resounding “yes” given our target of creating the most intricate and complex (in a fun way) roguelike ever.

 

So what about saving and loading?

 

We underestimated the complexity of saving and loading this huge set of data in the C# ecosystem. We had done very early prototypes in Java about three to four years ago when we first decided whether to embark on this journey or not. We used simple and plain Java serialization to test loading and saving data. And it performed splendidly. (To explain: serialization is a feature of many languages used to persist data to a hard disk or some other kind of storage and restore it from there – it usually saves you the pain of doing everything by hand at the cost of some performance). Due to the complexity of our data we very much planned to not handle everything on our own but to rely on trusted frameworks developed by other amazing engineers. And for the Java test everything worked well.
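For readers who have never used built-in serialization, the following minimal sketch shows the idea using Python's standard `pickle` module, which plays a role similar to plain Java serialization. It is only an illustration, not the prototype described above; the `save` and `load` helpers and the tiny player/sword data are assumptions.

```python
import pickle


def save(obj):
    """Serialize a whole object graph to bytes in a single call."""
    return pickle.dumps(obj)


def load(data):
    """Rebuild the object graph from bytes."""
    return pickle.loads(data)


# A tiny object graph with a shared object and a cycle (hypothetical data).
player = {"name": "player", "equipment": []}
sword = {"name": "sword", "owner": player}  # back-reference creates a cycle
player["equipment"].append(sword)
player["right hand"] = sword  # the same sword is reachable a second way

restored = load(save(player))

# The framework keeps track of identity: after loading there is still
# exactly one sword, and it still points back to the restored player.
print(restored["equipment"][0] is restored["right hand"])
print(restored["right hand"]["owner"] is restored)
```

The attraction is that nothing has to be written per class or per field: shared references and cycles are tracked by the framework, which is exactly the work you have to do yourself once you save and load by hand.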

 

Then we checked off this issue and basically ignored it for a long while. The original plan was to have a fully implemented load & save feature by Christmas 2020, and we had set aside a week for implementing and testing it because we had managed to implement it three years earlier in half a day in Java.

 

Then we tried it in C# and had to learn it simply didn’t work. At all. Beyond any hope. The only built-in C# framework we got to work is the BinaryFormatter. If you ignore for a moment that Microsoft itself says “Don’t use the BinaryFormatter – it’s a security disaster waiting to happen” (and still keeps it in the framework), it was the only framework that managed to save and load our game. But to this day I’m not 100% sure if it truly worked, because even for a freshly started game it created a save file of more than 100 megabytes, which took more than 10 minutes to save and more than 40 (!) minutes to load.

 

You can imagine our shock and disbelief after not having seen any such issues with earlier Java tests. After recovering from stun and shock we frantically went through other C# serialization frameworks (https://aloiskraus.wordpress.com/2017/04/23/the-definitive-serialization-performance-guide/ has a great list and compares them for performance). And none of them worked for us. Some showed critical bugs when encountering our huge data structures, some could not correctly handle all private data and would have forced us to completely wreck a well-designed architecture (which would have taken weeks – at this stage Ultimate ADOM has more than 500,000 lines of code and comments in more than 3,900 classes) with uncertain outcome. And each framework had different requirements on how to change the architecture. So we tested about half a dozen frameworks that required only mild changes, and each and every one failed. Some had unexplained crashes, others performed much worse for our case than you would have expected, etc.

 

At that point we already had spent more than 20 days working on nothing but this issue. 20 days that now were missing from our release plan. 

 

So we made the final and hard decision: We are going to do loading and saving ourselves and code everything by hand. From ADOM I knew how staggering this task is because you then have to go into each and every bit of data you have and save and load every single byte in the correct order. And not forget to handle a single important task. So we knew this task also was mind-blowing. My colleague and friend Jochen Terstiege (who has more than 15 years of ADOM experience) took the brunt of the task and the rest of the team supported him as best as we could. And we managed to have a working version of load & save about nine days before release. At the cost of another 10-12 days of working on nothing but load & save, going through more than a thousand classes to implement our own algorithm.
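To give an impression of what "save and load every single byte in the correct order" means, here is a minimal hand-written sketch in Python. It is not the algorithm we use in the game; the record layout (name, hit points, item list) and all function names are assumptions chosen for illustration.

```python
import io
import struct


def write_str(out, text):
    """Write a length-prefixed UTF-8 string."""
    raw = text.encode("utf-8")
    out.write(struct.pack("<I", len(raw)))
    out.write(raw)


def read_str(inp):
    """Read a string written by write_str."""
    (length,) = struct.unpack("<I", inp.read(4))
    return inp.read(length).decode("utf-8")


def save_item(out, item):
    write_str(out, item["name"])
    out.write(struct.pack("<i", item["damage"]))


def load_item(inp):
    # Fields must be read in exactly the order save_item wrote them.
    name = read_str(inp)
    (damage,) = struct.unpack("<i", inp.read(4))
    return {"name": name, "damage": damage}


def save_character(out, character):
    write_str(out, character["name"])
    out.write(struct.pack("<i", character["hp"]))
    out.write(struct.pack("<I", len(character["items"])))  # count first
    for item in character["items"]:
        save_item(out, item)


def load_character(inp):
    name = read_str(inp)
    (hp,) = struct.unpack("<i", inp.read(4))
    (item_count,) = struct.unpack("<I", inp.read(4))
    items = [load_item(inp) for _ in range(item_count)]
    return {"name": name, "hp": hp, "items": items}


hero = {"name": "hero", "hp": 42, "items": [{"name": "sword", "damage": 7}]}
buffer = io.BytesIO()
save_character(buffer, hero)
buffer.seek(0)
print(load_character(buffer) == hero)
```

Every class needs such a matching pair of save and load routines. If one field is forgotten, or two reads are swapped, everything after that point in the file is misread, which is why this approach is so laborious across more than a thousand classes.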

 

First results were great: file size was reduced from 100 megabytes to about 5 megabytes (and there is room for optimization), saving now takes 3-5 seconds at most and loading roughly 2-3 seconds (this will increase for very late game stages because there is more data to handle). Overall the results were blazingly good.

 

So we dared to breathe a sigh of relief, knowing full well that this problem had caused us to lose a total of a month planned for finalizing and polishing features before release.

 

And then about four days before release disaster struck again: We found a serious problem that ruins save files. And we knew we had lost. There was no way to ship this load & save feature for EA because destroying save files is even worse than not being able to save stuff.

 

So we made the decision to not have load & save in the EA release and struggle ahead. That is where we are standing right now and why we decided to disable it just before the launch. With my 20+ years of experience as a programmer and architect and all my ADOM/JADE/Java experience I felt right in assuming that load & save wouldn’t be the huge issue it has become, but sometimes it doesn't work out like that.

 

My optimistic side tells me we might have it by the beginning of next week, my pessimistic side says “end of next week”, but to be honest I really do not want to give a promise here because we already misjudged once. We are working on a fix and we can see it on the horizon. Now it is a matter of focus and concentration. That’s also why our communication probably could have been much better, and we hope this clears up the situation!

 

In summary:

  • We underestimated the complexity of saving and loading.
  • We ask you to continue supporting us and to have a little more patience while we finally get this done and into your hands.
  • Please test and play the game and give us all your other feedback. ADOM was built on the creativity and support of a most wonderful community and we want to continue this tried and tested approach to deliver the most complex roguelike ever (meaning fun complexity).

And please consider that we are a tiny Indie team. At best 2.5 programmers are working on this game (plus 2.2 graphical artists and one sound/music designer). And that’s it.

 

I am very sure that in a couple of weeks the then-current EA version will look very different from now and be a lot more complete, but we can’t change the present.

 

That being said, the team has already holed up and is doing its best to fix the load & save issue (and whatever else you report) and to provide new releases ASAP. We are 150% committed to Ultimate ADOM and everyone in the team gave their best to deliver a great experience. But sometimes it’s the small stones that cause you to stumble.

 

Thanks for listening. I hope to be back very soon with an announcement about a release containing load & save ;-) We will keep you updated!

 

Thomas Biskup
