The tooling only uses zlib because that was easiest. I didn't want to bake a zstd dependency into the nim project.
Extracting the raw resources from the clientside sqlite database and using those to seed the nwsync repo sounds _extremely_ cumbersome and backwards to me, even though the blob file format is the same (because I like reusing code; NO WARRANTY ETC). If you are concerned about storage, use zfs or another compressing file system. If you are really worried about transfer, store uncompressed and use http stream compression.
I experimented a fair bit with zstd compression levels and it turns out that, for NWN content (CEP in this instance), the default compression level was the best tradeoff. Increasing it resulted in MUCH longer compression times with absolutely negligible gains (<5MB per GB). I'm sure there's some wiggle room with dictionaries, but those really only show gains for lots of GFF data, which the average hak set doesn't contain. There would also be serious gains in chunking individual files together into compression blocks (like a zip file), because that allows zstd or zlib to sample more data, but that adds a - imo - unreasonable amount of complexity and brittleness; also, immediate seeking goes out the window, which would hurt game loading performance.
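If anyone wants to repeat that experiment, a rough sketch along these lines will show the size-vs-time curve (Python with the zstandard bindings; the input path is a placeholder):

```python
# Rough benchmark sketch: compare zstd levels on one content file.
# "sample.hak" is a placeholder; point it at whatever you want to measure.
import time
import zstandard

data = open("sample.hak", "rb").read()
for level in (1, 3, 9, 19):
    start = time.perf_counter()
    size = len(zstandard.ZstdCompressor(level=level).compress(data))
    print(f"level {level:2}: {size} bytes in {time.perf_counter() - start:.2f}s")
```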
The game side of things already saves and compresses files in a threadpool, so if your CPU and IO can stomach it, it will be barely noticeable with the current chosen defaults.
It's the transfer time to clients that I'm primarily worried about.
What led you to go with zlib over zstd for the tool?
What are the drawbacks to embedding zstd in the tool along with the client?
And unfortunately, http stream compression isn't really a viable solution, since it just uses the same deflate algorithm that zlib does (or gzip, which is just deflate with fancy headers), which would likely add space rather than reduce it.
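A quick Python illustration of that point - re-deflating something that's already deflated just adds overhead:

```python
# Compressing already-zlib-compressed data doesn't shrink it; the second
# pass typically comes out slightly larger than the first.
import os
import zlib

raw = os.urandom(4096) * 8       # repetitive stand-in for game resources
once = zlib.compress(raw)        # what nwsync storage already did
twice = zlib.compress(once)      # what deflate transfer-encoding would add
print(len(raw), len(once), len(twice))   # expect twice >= once
```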
zlib over zstd: Because nim already had zlib built into the distro; zstd would require compiling it on all supported platforms and shipping it alongside. It's doable, just not high-priority work.
Yes, true, http stream compression is just zstream too, but again, the main goal here was simplicity. I daresay that for 99% of users, the main bottleneck in a sync process will not actually be network speed.
One of the major improvements I would actually like to put in is support for directly indexing into hak files, instead of exploding resources into a hash tree. That won't get you any server-side compression either, but it might hint at why I am thinking about on-the-fly stream compression.
How possible would it be to fork the current build ourselves with zstd? If we were to embed zstd libraries into the tool, would the client be able to handle it seamlessly? I ask this in complete ignorance, since while I can follow the code (somewhat), I don't even know which language it's in, let alone anything else. lol
Just trying to figure out what's possible before I task a programmer with it.
What do you do to uncompress the data (lines 134 to 141)? Everything I have put the compressed data through complains about it not being valid Z/Zip/Gzip/Zstd format.
@NotFitForPurpose The container format has the following header:
"NSYC" # magic bytes
uint32_t # nsyc version (current is 3)
uint32_t # compression algorithm: 0=none, 1=zlib, 2=zstd
uint32_t # uncompressed payload size (since compression streams don't usually contain that info)
After this header, the compression payload follows. For zlib, it is:
uint32_t # version (currently 1)
For zstd, it is:
uint32_t # version (currently 1)
uint32_t # dictionary (0 = none, currently unused and just a placeholder to support seeding zstd dictionaries in the game distro)
After this, the raw data stream follows. For zlib, it is ZSTREAM (i.e. no gzip header). For zstd, it is simply using the compression methods that write out a single frame (i.e. ZSTD_compressXXX).
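Putting the pieces above together, a minimal Python reader for the container could look like this (a sketch only: the field order follows the description above, and little-endian byte order is an assumption):

```python
# Sketch of decoding a NSYC blob per the header layout described above.
# Little-endian field encoding is an assumption, not gospel.
import struct
import zlib

def decompress_nsyc(data: bytes) -> bytes:
    magic, version, alg, raw_size = struct.unpack_from("<4sIII", data, 0)
    if magic != b"NSYC" or version != 3:
        raise ValueError("not a NSYC v3 blob")
    pos = 16                          # end of the container header
    if alg == 0:                      # no compression: raw data follows (assumed)
        return data[pos:]
    if alg == 1:                      # zlib: skip payload version, then ZSTREAM
        pos += 4
        out = zlib.decompress(data[pos:])
    elif alg == 2:                    # zstd: skip payload version + dictionary id
        pos += 8
        import zstandard              # pip install zstandard
        out = zstandard.ZstdDecompressor().decompress(data[pos:])
    else:
        raise ValueError(f"unknown algorithm id {alg}")
    if len(out) != raw_size:
        raise ValueError("size mismatch after decompression")
    return out
```

That would also explain the errors @NotFitForPurpose saw: generic tools choke on the 16-byte NSYC header sitting in front of the actual zlib/zstd stream.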
The storage format inside sqlite is the same as the one described above, but again - no warranties of any kind it'll ever remain that way.
Is there any limit by default?
Thx!
You can use your webserver's mechanisms to throttle client transfer rates. Look up "ratelimit" in your web server documentation. It'd be best if you picked a limiter based on per-IP buckets that store bytes transferred, instead of requests made. Make sure your webserver only delays responses and doesn't actually return 5xx messages, as the game will error out the transfer on that.
The other approach would be to use a kernel-level traffic shaper. Those will be even more efficient, and simpler in many ways, but require some deeper understanding of how networking works, no matter the host OS.
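To make the first option concrete, an illustrative nginx snippet (the directives are real, the numbers are made up; adapt them to your own server):

```nginx
# Serve manifests/data at full speed for the first 10 MB of a response,
# then cap at 2 MB/s. Note: limit_rate is per connection, not per IP,
# so this is a simplification of the per-IP-bucket advice above.
location /nwsync/ {
    limit_rate_after 10m;
    limit_rate       2m;
}
```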
@TheCapulet Requires adding the zstd library to the package and writing the necessary code to support targeting it. Why are you so intent on zstd for the serverside repository?
Here's a good example:
I think you take your fancy European internet speeds for granted, niv.
For instance, at the average 4G speed here in the US, that's a difference of 17 minutes of download time.
That's a BIG difference for new players, who will likely just hit cancel and move on to the next server otherwise.
And that's the 4G average. My community has players from all over the world, and some of them have drastically slower internet than even I do. (I average around half a meg download, half of what I quoted above.)
At the end of the day, I'm just looking after the end user experience. The less time they spend looking at a progress bar, the better.
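The back-of-envelope math, with purely illustrative numbers rather than the exact figures above:

```python
# How long an extra chunk of download takes at a given link speed.
extra_mb = 128       # hypothetical extra data from weaker compression
speed_mbit = 1.0     # roughly the kind of 4G average being quoted
minutes = extra_mb * 8 / speed_mbit / 60
print(f"~{minutes:.0f} minutes extra")   # ~17 with these numbers
```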
As figured, zstd does not have enough wiggle room here to throw its weight around. Two (three) issues:
1. We are compressing each file individually, for obvious reasons. zstd performs slightly worse than zlib at compressing many small payloads.
2. No shared header/no cross-file compression such as deduplication.
3. No dictionary usage yet.
We can solve #3, and that might help some, but in the end you will never see the levels of compression you can get by deflating the whole file instead of each resref individually. This is just the tradeoff we pay for the deduplication/flexibility.
I will still be adding zstd only for point 3 - the dictionaries are pretty neat - but don't expect huge gains. Just using zstd instead of zlib actually increases storage cost slightly without them.
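For anyone curious what point 3 would buy, the idea can be prototyped with the Python zstandard bindings (the training corpus path here is hypothetical):

```python
# Train a zstd dictionary on a pile of small GFF blobs, then compress
# each resource individually against it. Paths are placeholders.
import glob
import zstandard

samples = [open(p, "rb").read() for p in glob.glob("exploded/*.ut*")]
dict_data = zstandard.train_dictionary(110 * 1024, samples)

cctx = zstandard.ZstdCompressor(level=3, dict_data=dict_data)
dctx = zstandard.ZstdDecompressor(dict_data=dict_data)

blob = cctx.compress(samples[0])
assert dctx.decompress(blob) == samples[0]
print(len(samples[0]), "->", len(blob))
```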
Looking into this for the server I host, for the upcoming console/iOS versions. What is the current max file size for NWSync? I currently have haks that are hundreds of MB.
There is no max hak size, but for console/iOS you should keep in mind how much storage the players have available there. There is a max single resource file size (think single texture, not whole hak), which is somewhere between 10 and 15 MB. 4k DDS textures are okay; 8k is problematic. For TGAs, 2k should be the limit. Other than a couple of broken community models, there's nothing else that can breach this limit.
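Rough numbers behind those texture limits, assuming DXT1 block compression at half a byte per pixel plus about a third extra for mipmaps (the cap itself is the 10-15 MB figure above):

```python
# Why a 4k block-compressed DDS fits under a ~15 MB cap but an 8k doesn't.
def dds_mb(side: int, bytes_per_px: float = 0.5) -> float:
    return side * side * bytes_per_px * 4 / 3 / 2**20   # +1/3 for mipmaps

print(f"4k DDS: ~{dds_mb(4096):.1f} MB")            # ~10.7 MB -- fine
print(f"8k DDS: ~{dds_mb(8192):.1f} MB")            # ~42.7 MB -- way over
print(f"2k TGA: ~{2048 * 2048 * 3 / 2**20:.1f} MB") # 24-bit, ~12 MB -- near the edge
```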
Are there any plans to simplify or automate setup of NWSync? For those of us who are not coders or web developers for fun or for work, it is very daunting when the first step of the highly-technical nwsync guide is to "setup an http server which is beyond the scope of this guide."
I know that the original NWN losing support from all "official" sources has resulted in many PWs wanting to manage everything themselves, against that day in the very distant future when Beamdog goes away and NWNEE is no longer supported either. But assuming nwsync will not get easier to set up, and assuming that Beamdog is years away from dropping NWNEE support, I'd like to appeal to @JuliusBorisov and the other Beamdog powers that be who may read this to bring it up internally and consider running a Beamdog-hosted nwsync solution.
It would be a great help for 1) the PWs that can't/don't have the technical expertise on staff to set up their own http servers, or 2) the PWs on cloud platforms whose owners can't afford the GBs of data egress charges NWSync will produce with every player who connects to the server. For a business justification, I would argue that it will be a poor experience for the now-high-priority console gamers who will be utterly unable to connect to the large number of PWs that aren't always at the top of the multiplayer list, when some (most?) of them do not have NWSync set up.
Yep, a more detailed tutorial/video would be helpful. I tried getting this up and running and got lost pretty quickly.
I have the PW files uploaded to my "web server" (via bluehost - it's an https address. Is this still unsupported?). What exactly do we do with the github/binary files? Place them in the same directory as the PW files on the server?
I host the PW with nwserver.exe on a spare PC, so I'm not too familiar with working with a server when it comes to hosting games and whatnot.
@Balanor @Grizzled_Dwarflord @jpsweeney94
It's no video, but I've laid out a complete walkthrough for using nwsync - this doesn't cover its whole potential, but it covers everything needed to get you up and running.
https://docs.google.com/document/d/1RXIf1vD-dE6p-ZzHs-PX4eyeMSYVbsYc89XvC8tscCs/edit?usp=sharing
Bill the Dragon,
That is most glorious. I appreciate that you delved into the making/hosting of an http server. I think, for the most part, that's where people get hung up. Perhaps jumping on the NWVault Discord for some specific help on that might be beneficial.
@WilliamDraco This was a huge help to me. Setting up the scary web server was as simple as could be, and your GUI made it very easy to create the manifest files. I was able to set up a test server using nwsync in about an hour using this guide, and WD was very helpful on Discord with my newbie questions.
I still wish Beamdog would consider a simpler approach to this whole process (even putting a field for the nwsync url on the Windows NWServer GUI would be a nice step), but I realize it's unlikely to happen with so many other priorities. So this more user-friendly guide is a welcome help. Thanks very much, Mr. Draco!
Thanks for the guide, @WilliamDraco! Currently letting "NWSync Write" do its thing. Is it normal for it to take a long time on larger PW files?
I've been letting it run for 12+ hours and the "nwsyncdata" folder is only a bit over 400 MB in size. The total size of all the PW server files is around 2 GB.
@sknymick Hmm, thanks for the info. Odd that it's taken me 3x as long just to get 400 MB? Running it on an SSD too.
Would running the server at the same time affect it at all? (I run it locally on NWServer, so it's the same PC the NWSync process is running on.)
I think it very likely would affect it, though I don't know by how much. All my HDDs have platters, but the server we host from uses an SSD as well, and it took the aforementioned couple of hours. Though, I don't think he was running the module simultaneously.
You should also anticipate some slowing if you are planning on hosting your NWSync files on the same computer you host the game on. Or if you're using the same network (NWSync uses large chunks of data at a time).
Thanks. Maybe I'll try taking it offline for a day or so.
I plan on hosting the NWSync files on a web server, not the same machine or my network thankfully.
@jpsweeney94 The server I work with has over 10 GB of nwsync data and took approx 30 minutes for the first run, with all files on the local computer and on SSDs. You can always check the number of files in the manifest to confirm it is still working, as it should be constantly adding new ones every few seconds as it works.
If it has frozen (the program itself, not the GUI, given I've not managed to thread it properly yet), you're totally fine to force-close and restart the process without doing permanent damage, although depending on where in the process it is, it's likely to have to start over.
Regarding the actual programs, the only added slowness from the GUI is the log - which you can run with Quiet mode to reduce - but that is minimal. Outside of that, any slowness is down to nwsync itself. Consider things like IO throughput, network limits, or potentially whether you're running it on a small VM or some such, which might be slowing the process.
@WilliamDraco Odd, so is there anything I can do to speed this up, or any idea why it's moving so slowly? It's slowed down to a crawl. The past hour or two has only added another ~20 MB of data to the folder.
As far as I can tell it is still running and adding files. I can see the timestamps of the folders being updated in the file explorer in the "sha1" folder. Almost every minute more folder dates change to the current time.
Edit: Ok, I'll look into those. I don't think any of that applies, but I will check to be sure.