• 0 Posts
  • 9 Comments
Joined 2 years ago
Cake day: June 25th, 2023

  • Apertus was developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. Particular attention has been paid to data integrity and ethical standards: the training corpus builds only on data which is publicly available. It is filtered to respect machine-readable opt-out requests from websites, even retroactively, and to remove personal data, and other undesired content before training begins.

    This is probably the best we'll get, but it sounds like it's still trained on scraped data unless you explicitly opt out, including anything mirrored by third parties that don't opt out. Also, they can remove data from the training material retroactively, but presumably they won't retrain the model from scratch. That means the removed data will still be baked into the weights, and the official weights will keep a potential advantage over any model trained later on the filtered training data.
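    To illustrate the mirror problem: assuming "machine-readable opt-out" means something like robots.txt (my assumption, the announcement doesn't name the mechanism), a rough sketch with Python's standard `urllib.robotparser` looks like this. The bot name `ExampleAIBot` and the URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a site that opts out of one crawler only.
ORIGINAL_SITE_ROBOTS = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

# A third-party mirror that never set up any opt-out (empty robots.txt).
MIRROR_ROBOTS = ""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The original site is skipped, the mirror of the same post is not.
    print(may_crawl(ORIGINAL_SITE_ROBOTS, "ExampleAIBot", "https://example.org/post/1"))
    print(may_crawl(MIRROR_ROBOTS, "ExampleAIBot", "https://mirror.example.net/post/1"))
```

    The opt-out is per site, so the same content is still fair game wherever it has been copied without one.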

    From the license:

    SNAI will regularly provide a file with hash values for download which you can apply as an output filter to your use of our Apertus LLM. The file reflects data protection deletion requests which have been addressed to SNAI as the developer of the Apertus LLM. It allows you to remove Personal Data contained in the model output.

    Oof, so they’re basically passing on data protection deletion requests to the users and telling them all to respectfully account for them.
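    For a sense of what that means in practice, here's a minimal sketch of a hash-based output filter. The license excerpt doesn't describe the file format, so everything here is an assumption on my part: SHA-256 over lowercased, whitespace-normalized word sequences, the `max_words` window, and the `[removed]` placeholder are all hypothetical.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (assumed normalization, not SNAI's)."""
    return " ".join(text.lower().split())


def sha256_hex(text: str) -> str:
    """Hash a normalized string; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def redact(output: str, blocked_hashes: set[str], max_words: int = 3,
           placeholder: str = "[removed]") -> str:
    """Replace any run of up to max_words words whose hash is blocked."""
    words = output.split()
    result = []
    i = 0
    while i < len(words):
        # Try the longest window first so multi-word names match as a whole.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if sha256_hex(candidate) in blocked_hashes:
                result.append(placeholder)
                i += n
                break
        else:
            # No window starting at this word matched; keep the word.
            result.append(words[i])
            i += 1
    return " ".join(result)


if __name__ == "__main__":
    blocked = {sha256_hex("Jane Example")}  # stand-in for the downloaded file
    print(redact("Contact Jane Example today", blocked))
```

    Even in this toy version, every user has to download the file, keep it current, and run every output through it. A trailing comma or a slightly different spelling is already enough to slip past an exact-hash match.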

    They also claim “open data”, but I’m having trouble finding the actual training data, only the “Training data reconstruction scripts”…

  • Dual booting can be problematic. As mentioned, you're messing with your partitions and could damage your Windows partition, but Windows can also, unprompted, break your Linux bootloader. As long as you're careful with partitions and know how to fix your bootloader from a live image, there's no real issue, but it's worth keeping in mind.

    By the way, I recommend rEFInd as the boot manager when dual booting: it needs no configuration and detects bootable systems automatically.
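    One caveat: rEFInd is a UEFI boot manager, so it only helps if the machine boots in UEFI mode. A quick way to check from a running Linux system (a live image works) is to look for `/sys/firmware/efi`, which the kernel only exposes on UEFI boots. A small sketch; the function name and the overridable path parameter are just mine for illustration:

```python
from pathlib import Path


def booted_in_uefi_mode(efi_dir: str = "/sys/firmware/efi") -> bool:
    """Return True if the running Linux system was booted via UEFI.

    The kernel creates /sys/firmware/efi only on UEFI boots; the
    parameter exists only so the check can be pointed elsewhere in tests.
    """
    return Path(efi_dir).is_dir()


if __name__ == "__main__":
    if booted_in_uefi_mode():
        print("UEFI boot detected")
    else:
        print("Legacy BIOS boot (or not Linux)")
```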

    A VM sounds like a good way to try a few things out, but keep in mind that performance can suffer, and you're especially likely to run into issues with things like GPU virtualization. If you want to properly verify that things work, and work well enough, you'll want to test them from a live system.

    As a final note, you can give your VM access to your physical SSD/HDD. If you set that up properly, you can install Linux to the real disk from inside a VM, boot it there, and later switch to booting it natively. You still risk messing up your partitions in that case, but it's nice to be able to look things up on your host system while setting up Linux in the VM.