Sadly, it is unmaintained.
https://github.com/jandinter/gesetze-im-internet
I scrape the official website (https://www.gesetze-im-internet.de) once a week. The repository contains the "official" XML files with a formatting that is more focussed on presentation than on the logical structure of the legal acts, unfortunately (https://www.gesetze-im-internet.de/dtd/1.01/gii-norm.dtd).
Some time ago, someone from the digital service of Germany reached out and asked about my use case. Maybe there will be an official version of a "Git law" repo someday...
I wanted to build an "IDE-inspired" law reader. It has selection highlighting and you can open references within the same window. It scrapes gesetze-im-internet.de daily, processes the XML to JSON and builds static HTML pages, hosted on GitHub Pages. The entire build process for the 6000+ pages takes 5-10 minutes. It uses up less than 20% of the Actions minutes that come with GitHub Pro.
It was a really fun rabbit hole to go down.
What I found most fascinating is that: There doesn't seem to be an official version of the German law. The state just publishes official announcements like "Law X will be changed as follows", "Law X will be removed" or "Law X will be added". So the official version of the German law really is something akin to a git tree. AFAIK, all consolidated versions are created by private entities.
I did a test by picking a law at random, finding the first time it was published and then applying all the changes from subsequent years. Turns out all available versions (gesetze-im-internet, dejure.org, buzer.de) had at least a couple of small mistakes. I found that quite fascinating (and a little scary).
It's also funny how often laws are referenced that don't even exist anymore. The collection of laws really is as tidy as you would imagine an 80 year old system, where the maintainers change every 5 years, to be.
Can you say more about what these small mistakes were? Would they affect the interpretation of the law?
buzer.de actually has a list of things that differ in their consolidation compared to gesetze-im-internet.de: https://www.buzer.de/quality.htm
In that list you can actually find mistakes that would alter the interpretation. But I think this also sounds worse than it is. It's just a funny thought that whatever source you are using, you are essentially trusting one party to not have made any mistakes, consolidating 1000s of pages of pdfs :)
But my point is that, as far as I know, there is no official version of the final text. The official publications are made in the Bundesgesetzblatt (which had been privatized in the past, but that's another story). The publications might look like this:
1947: We hereby make the following text a law called Grundgesetz "Artikel I: Human dignity is inviolable"
2026: We hereby change the law called Grundgesetz by changing the first article to say "Human or Alien" instead of "Human".
Now there are a lot of entities that will consolidate these changes into a final text. But this consolidation isn't done officially. So, while in this example it's easy to see that in 2026 the law would read "Human or Alien dignity is inviolable", it becomes less clear when these changes are spread over 80 years and are only available as PDFs.
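To make the toy example concrete, here is a minimal sketch of what a consolidator does: take the originally promulgated text and replay the change announcements in order. The `Amendment` type and `consolidate` function are made up for illustration; real change laws are far less regular than a single search-and-replace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Amendment:
    """One hypothetical 'change law': replace `old` with `new` in an article."""
    year: int
    article: str
    old: str
    new: str


def consolidate(original: dict[str, str], amendments: list[Amendment]) -> dict[str, str]:
    """Apply amendments in chronological order and return the consolidated text."""
    text = dict(original)  # never mutate the promulgated original
    for a in sorted(amendments, key=lambda a: a.year):
        if a.old not in text[a.article]:
            # A real consolidator would have to flag this for a human, not guess.
            raise ValueError(f"{a.year}: {a.old!r} not found in {a.article}")
        text[a.article] = text[a.article].replace(a.old, a.new, 1)
    return text


original = {"Artikel I": "Human dignity is inviolable"}
amendments = [Amendment(2026, "Artikel I", "Human", "Human or Alien")]
consolidated = consolidate(original, amendments)
print(consolidated["Artikel I"])  # Human or Alien dignity is inviolable
```

The `ValueError` branch is where the small mistakes creep in: once an earlier change was applied slightly wrong, a later "replace these words" instruction no longer matches anything.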
[EDIT: fixed link]
I'm asking as I don't agree on the underlying assumption a use case was needed. I consider the value of transparency and public information for a democratic society as evident.
I'm also interested in the response btw :-)
In source code we replace or modify the parts that don't work in place. Many laws do not work like that; they are a labyrinth of add-ons. A new law is introduced with wording like "This replaces the words 'small businesses' with 'nuclear rockets' in the law on 'Workplace safety of fishing vessels of 1992', §12, section 3, line 5."
No amount of version control will ever find these changes.
I actually think version control is an absolute necessity for laws.
So a country only needs to rewrite all the laws to adopt versioning, cool.
In reality both can be used: commits to see what changed and by whom, and wording that says what changed.
No, they only need to start using versioning in order to adopt versioning. Think of an "initial git commit".
Changes will either add, delete or change an existing law.
There is actually a website where someone has all changes dating back to 2006 and you can display diffs (called Synopsis in Germany) - for example: https://www.buzer.de/gesetz/5041/v322454-2025-03-25.htm
It's mostly the obsolete system of common law where, to have an understanding of what is legal and what isn't, you need to have a spiderweb of random acts (random as in, they don't have to be thematic, so the Chicken Tax Act of 2005 can have provisions on solar panels that replace the previous solar panel legislation from the STOP KILLING OUR COMMUNITIES Act of 1785) that build on or replace one another, sometimes going very far back, with associated precedents that clarify them.
The case happened to me when I searched for the original text that said a worker has to be compensated in full for "short" sick leave, and what I found was a very short text in German. Hopefully the company I worked for complied with the law after consulting its accountant.
So you've called out precisely why version control systems present such a useful analog.
Of course one could also argue that it isn't a problem with poorly designed laws but that our programming languages are ill equipped for it.
Then again, the funniest thing I've seen in law: Where an engineer would make a nice drawing with the size of things neatly organized into available space, a law maker will spray all the numbers and description randomly all over the place as if to prevent anyone from ever building the described.
Hard disagree. It allows you to attach a name to particular portions of the code (and a date), it shows you when the code moves from one status to another (branches), and you could even easily do things like show who voted/signed for any given piece.
What's not really compatible is law making as it is now, where repealing a law doesn't remove the offending code but adds more code that says "now you can ignore that previous one". Those don't even make it into the official text until codification occurs (this is periodic, not continuous).
>In source code we replace or modify the parts that don't work in place. Many laws do not work like that; they are a labyrinth of add-ons. A new law is introduced with wording like "This replaces the words 'small businesses' with 'nuclear rockets' in the law on 'Workplace safety of fishing vessels of 1992', §12, section 3, line 5."
Exactly. They've been doing it wrong (artifact of doing everything on paper, I think).
Ja, there must be order.
It's not easy to convert general law to markdown; it involved online converters and manual fixes. Currently experimenting with marker [1] on local LLM hardware and so far it is the best out there.
One of the nice things about having an underlying structured representation of those texts is that I can also render them to e.g. Markdown[1].
I've experimented with generating the Markdown files corresponding to multiple versions (archives) of a given text and committing them to the same Git repository to be able to see diffs or blames[2].
I would like to assign the proper dates to each commit, but given there are texts in e.g. 1791, it's not possible.
[0] https://refli.be/fr/lex
[1] https://github.com/hypered/iterata-md
[2] https://github.com/hypered/iterata-archive
This is something I'm very interested in for a different use case: model legislatures. The infrastructure and tooling for model congresses and parliaments is very limited: largely relegated to wikis and Google Docs. And that's fine, but it becomes a problem long term with tracking and archival.
We had a situation where our model parliament did not own the Google Doc for a particular treaty with another model legislature. It was changed out from under us, which is not ideal. But that brings into question ownership of Google Docs, and what happens if that person withdraws from the game.
Another issue is respecting and maintaining the creativity of those who play. People put a lot of effort into their bills with the fonts, formatting, layout, and imagery they use. It would be a shame to erase all that effort by converting it to a bland wall of text a la markdown.
Markdown also has its issues: if the legislature removes an entry of an ordered list, how do you prevent markdown from renumbering the list? And the ways around this involve extending markdown, or using plaintext (eg: https://www.apache.org/licenses/LICENSE-2.0.txt)
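One workaround that stays inside plain Markdown, assuming a CommonMark-style renderer, is to not use list syntax at all: escape the period after each number so the clause is an ordinary paragraph that keeps whatever number you give it. A rough sketch (`render_clauses` is a made-up helper, not part of any Markdown library):

```python
def render_clauses(clauses: dict[int, str]) -> str:
    """Render clauses as separate paragraphs with literal, escaped numbers."""
    # "3\." is text, not an ordered-list marker, so nothing gets renumbered.
    paragraphs = [f"{number}\\. {text}" for number, text in sorted(clauses.items())]
    return "\n\n".join(paragraphs)


clauses = {1: "Name of the act.", 2: "Definitions.", 3: "Transitional rules.", 4: "Commencement."}
del clauses[3]  # the legislature repeals clause 3
markdown = render_clauses(clauses)
print(markdown)
```

After the repeal the output still goes 1, 2, 4. The price is that you lose list semantics in the rendered HTML, which is roughly the same trade-off as the plaintext approach.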
Another solution could be QuillJS (https://quilljs.com/), which serialises into a JSON array of Deltas. However, this would make any kind of git-diff difficult to read. You'd need a custom differ, which is not impossible, but that may be a lot of work and may not be supported on git sites like Github.
Another issue is that, if you're using commits-as-enactments, then that probably means using the commit message (or notes) for the enactment's text. To what extent is that supported? As in, how long can commit messages be before it starts wreaking havoc on git clients? Will my Github tab or GitKraken client crash if I view the commit history? Could the commit message itself contain a serialised QuillJS document? What if that document contained a base64-encoded image?
I don't know if they are doing it, but I always thought that it should be easy for regular citizens to see the historical reasons why a law or regulation exists. Because there is sometimes a good reason why a regulation exists, but nobody knows it.
So git was first created with u32 time in mind only. However because of the looming year-2038 problem, they are working on expanding that.
Apparently git internals are almost ready to support more interesting timestamps. However, much of the git tooling and UI (like command line parsing and output) refuses to deal with pre-epoch timestamps.
I briefly tried with git 'porcelain' and also via libgit2, but it's all a bit annoying.
In summary, I think you'd need to hack up at least some of git's tooling to make everything work, but it wouldn't be heart surgery, because the internals are already nearly ready for this kind of change.
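To see why old texts are awkward, here is a small sketch that computes the Unix timestamp for a date in 1791 and checks it against an unsigned 32-bit range (the "u32" assumption above; I'm not claiming this is exactly how any particular git version stores it). The date itself is just a placeholder.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_seconds(moment: datetime) -> int:
    """Seconds since the Unix epoch; negative for dates before 1970."""
    return int((moment - EPOCH).total_seconds())


def fits_unsigned_32(seconds: int) -> bool:
    """True if the value can be stored in an unsigned 32-bit field."""
    return 0 <= seconds < 2**32


# Placeholder date for "a text from 1791" (Python uses the proleptic Gregorian calendar).
text_from_1791 = datetime(1791, 1, 1, tzinfo=timezone.utc)
ts = unix_seconds(text_from_1791)
print(ts, fits_unsigned_32(ts))
```

The timestamp comes out negative, so anything that parses or prints dates assuming "seconds since 1970, never below zero" will reject or mangle it, even if the storage format itself could hold a signed value.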
At least you won't need to worry about figuring out historical leap seconds.
I think that would be a 'timezone' conversion you do at display time. Internally, it's still stored as a unix timestamp.
Tangentially, most RSS readers don't play nicely either. A lot of web tooling doesn't like featuring old poetry and the like with the actual dates, e.g.: https://alexalejandre.com/poetry/ I got a few readers, e.g. newsboat, to update their handling though.
Edit: I see your focus is on another thing completely. Share it, could be a great topic also.
For example in the USA we would have a budget ceiling crisis, and both parties try to ram through a law to bump up the debt ceiling "to prevent government shutdown". It is being sold as a measure to keep government afloat and running, and is usually run through pre-holiday, like Christmas.
But what actually happens is that thousands and thousands of pages of various pork are rammed through with various cutouts and carveouts for special interest groups due to lobbying.
The public needs to know who is adding these lines, when and how, and how bipartisan consensus is being achieved, in real time, not post-factum.
This is a horrible idea in practice because everything that is public and open turns into a purity test.
You need people to be able to negotiate with each other in order for consensus to be established, and negotiations only work if the negotiators give up on something that they want to get something else. The moment you make this public all that you get is people turning negotiations into a way to generate soundbites and scared of doing any actual work because they'll just give ammunition to their opponents.
This is doubly bad in the US with the primary system, which makes legislators even more vulnerable to attacks from their flanks.
Legislators are not elected to be proxies for the voters, that's not how it's supposed to work. They're elected to use their judgement, that's why there usually aren't recall elections or restrictions on how they can vote etc.
As a matter of fact I'm of the opinion politics everywhere would be a lot better if plenaries, committees and hearings were not recorded or televised in the first place. I'm ok with minutes being made available but I'm convinced without being able to clip soundbites or tiktoks out of every meeting legislatures would be a lot more productive. Definitely more so than if we attached a camera crew to everyone in politics for "transparency"
At that point we might as well get rid of the press, as otherwise someone might be able to hold someone actually accountable for their actions and decisions. Taking the argument ad absurdum, we might even go back to monarchy so we don't have to deal with informed (or quasi-informed) voters to begin with.
I get where you come from, that the public perception of politics is mostly soundbite-driven is indeed a huge issue, in my opinion probably one of the biggest issues of our century, as it allows absolute incompetence a democratic pathway to power by playing to human basic instincts and emotions.
But as long as we want to cling to democracy, the voters _must_ have a way of knowing who is doing what, who is involved in which decision, and what favors are being traded. How else is a voter supposed to make an informed choice?
EDIT: To address the soundbite-problem, I think systems that are more oriented towards consensus democracy (proportional elections, chance for referendums etc.) rather than competitive democracies (first past the post, majority takes all) are more stable against it. Election systems should favor choice of opinion rather than choice of persons, if that makes sense. I think especially the US (for context, I'm Swiss) would benefit a lot from such changes; right now it seems all outrage-driven.
Minutes are a thing, you know? And I'm not saying all sessions need to be held behind closed doors, I'm perfectly fine with journalists or the public being present.
> _must_ have a way of knowing who is doing what, who is involved in which decision
They do, that's what elections, roll calls and minutes are for
> and what favors are being traded
You're implying this is actually possible, it's not. Favours will always be traded in secret and deals made. All that the radical transparency proposals do is making sure that compromises can't be done effectively in official settings
Government and officials will fight to the teeth to avoid accountability and transparency because that's where the money and power is.
Alas, that sounds like a great idea in principle, but is probably a bad idea in practice.
Speeches in parliament (or on the senate floor, in the US) are already public. And that's a big reason those speeches are useless: they are just used as grandstanding to the general public.
The real work in finding compromises happens behind closed doors. That way you avoid producing sound bites that can be used against you next election season. Especially from challengers in your own party, who could otherwise accuse you of being insufficiently pure.
I'm afraid an ugly compromise of muddling through with some transparency is the best we can get in practice. At least if your democracy features voting, and especially first-past-the-post voting.
As one alternative, filling your parliament up via sortition might eliminate the downsides of transparency.
This is what the press and various independent groups already do. They have people that pore through the stuff, as well as getting tips and press releases from congressional reps and third party interest groups. It's just that there's only so much they can cover, and most of the public can't be bothered to do more than turn on the evening news.
There's a lot of good, in-depth journalism out there. You just have to look a bit harder for it.
Most git repos are not adversarial, so they get away with using it. A malicious commit which put a loop or a fork in your commit history would just be rejected--if not by the software, then by the maintainer.
But if you're tracking the state of some legal procedure in congress or whatever, you really don't want anybody playing games with history, since presumably it would determine important societal outcomes like whether a bill became law.
> we don’t care about people sneaking things into our git repositories
But rather that in the event of a collision, git would not sneak the attacker's malicious code into your repo. The best such an attacker can expect to achieve is to create confusion. In a project-maintainer scenario, that probably just means rejecting the PR--hardly an outcome that would justify spending the money on the hash collision in the first place.
In a we're-voting-on-whether-or-not-to-change-the-law scenario, confusion about the outcome of the vote could have dire enough consequences that an adversary might indeed care enough to bother with the calculation.
That's not to say that this is the only way in which git would be a bad choice, it's just the first that came to mind.
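For context on why a collision matters at all: git names content by its hash. As a sketch, this is how the ID of a file ("blob") is derived in a SHA-1 repository: a short header, then the bytes, hashed together. `git_blob_id` is my own helper name, and the law texts are placeholders.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object ID that a SHA-1 git repository assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)  # object type, size in bytes, NUL separator
    return hashlib.sha1(header + content).hexdigest()


law_v1 = b"Human dignity is inviolable\n"
law_v2 = b"Human or Alien dignity is inviolable\n"
print(git_blob_id(law_v1))
print(git_blob_id(law_v2))
```

Two different texts normally get two different IDs. A collision attack means deliberately crafting two texts with the same ID, so that two parties could each hold "the" version of a bill under one name and disagree about what it says.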
The idea that there's some sort of forced budgetary event on the calendar, related to the debt limit, that's then raced through to have competing solutions, then laden down with pork is Not Even Wrong, i.e. in the Pauli sense, and plainly wrong.
You raising the October date does leave open that maybe they're thinking of budgets and think the minority party in Congress has peer footing as a solution / is required to make a budget. That, at least, would explain the pork mention.
https://thehill.com/homenews/senate/5046873-rand-paul-johnso...
https://thehill.com/homenews/senate/475831-mcconnell-flexes-...
Separately, I can't tell how the articles are related.
I am not claiming there has never been a debt limit issue in US politics, if that is what you are asking.
Like I get it, but English is fluid and is it really that far to make the assumptive leap from "git" to "github/lab/service", seems pretty clear what they meant (even if it's not completely/technically/um-actually correct because git != github).
(Okay there's also `git` in the URL, I noticed, which means whoever made the page had that mental mapping in mind but still...)
I just feel clickbaited that this item is at the top of this forum. I'll stop engaging with this now. Nothing of interest for me here.
What one usually needs is, for example, cross-references, i.e. when a law contains a phrase like "the certification is issued by a relevant authority", you want to have the "relevant authority" wrapped in a hyperlink pointing to a government order that designates that authority. Also, you typically want to have links to court cases related to some paragraph near it, etc. If some change is planned to the law, you want to have a note in the text like "this is going to change on September 1", etc.
Given that many countries have local laws, you might want to be able to filter by a place.
Github might be used for storing raw documents as some weird kind of a database, but it is useless for actually trying to find out the answers to legal questions.
To make an analogy, reading laws on Github is like reading source code without syntax highlighting and navigation.
By default, git's diff and merge want lines-of-code to be meaningful and are set up for that.
Or you can write your own diff (and merge) driver and plug it in, git is flexible enough for that. At least in theory. The original idea was mostly to offer that to help with eg tracking of binary formats.
I'm not sure this feature is actually widely used enough to have gotten enough polish to be useful.
So in practice you might be better off changing your data format slightly, to make lines-of-text a meaningful thing, and then use git mostly as is.
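As a sketch of "changing your data format slightly": store legal text with one sentence per line, so a line-based diff reports changed sentences instead of changed paragraphs. The splitting rule below is a deliberately naive assumption; abbreviations like "Abs. 3" would need extra handling.

```python
import re

# Naive rule (an assumption): a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def one_sentence_per_line(paragraph: str) -> str:
    """Reflow a paragraph so that every sentence sits on its own line."""
    sentences = _SENTENCE_END.split(paragraph.strip())
    return "\n".join(sentences)


paragraph = (
    "The permit is issued by the authority. "
    "It is valid for two years. "
    "It may be renewed once."
)
print(one_sentence_per_line(paragraph))
```

With this layout, amending "two years" to "three years" shows up as a one-line change, and blame works per sentence without any custom diff driver.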
Why should legal texts be any different as far as git is concerned?
So my point was that git is obviously useful for source code of programs.
And as you point out, git does not provide 'go to definition' for source code of programs.
Hence I suggest that the inability of git to provide cross-references in legal text is about as relevant (or rather irrelevant) to the discussion at hand as git's inability to provide cross-references in source code.
Does this make sense?
Git is useful for collaboration of multiple people on the same project. Is law making a collaboration? Typically there is a single person who signs the bill into law. But there is collaboration during work on the bill, though.
But I do not think that people who make laws want to write git commands in the console. They want a GUI (ideally integrated into Microsoft Word). And if we are making a GUI, why not drop git and use a traditional relational database for storing the data?
You are right that the default UI of git is intimidating for normal people.
Saying 'relational database' says about as much about how you actually store the data as saying 'json' or 'xml'. Yes, you could use a variant of git that stores all its information in a relational database. (And in a saner parallel universe, git might have used something like sqlite internally, instead of hand-rolling its own formats from scratch.)
But the question of UI is pretty much independent from the question of how you want to store the data.
Although Microsoft is especially good/evil at getting Azure lock-ins and overpriced exclusive deals from e.g. governments.
Github is a high-quality software engineering enablement platform.
And git is a letter management engine.
This needs human curation to be useful.
Unfortunately I think those commits are the laws (depending on the country, of course, speaking for the cases I know). There isn't, except in a few cases, a notion of documents those commits are applied to -the commits are applied to previous commits. It doesn't make a difference functionally, but makes the system incredibly messy, hard to maintain and to navigate.
When we do 'git show' we typically look at the diff to the parent commit. But in git that's derived data. The 'snapshot' of your files is the fundamental point of view in git, and they are the source of truth (in some sense).
As you say, they make the diffs the source of truth in the US. I think that's how Subversion used to work. I don't know if that's still true, though.
The git model makes diffs between arbitrary commits about equally expensive to produce, no matter how far away in the commit graph they are from each other (if they are even connected at all). It also makes complicated merges conceptually simpler.
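A toy version of the snapshot view, using only Python's `difflib`: each version is stored whole, and a diff between any two of them is computed on demand. The version names and texts are placeholders, and this says nothing about git's actual storage format.

```python
import difflib

# Every version is a complete snapshot; no version is defined as "previous + patch".
snapshots = {
    "v1": ["Human dignity is inviolable."],
    "v2": ["Human or Alien dignity is inviolable."],
    "v3": ["Human or Alien dignity is inviolable.", "All state authority must respect it."],
}


def diff(old: str, new: str) -> list[str]:
    """Derive a unified diff between any two stored snapshots."""
    return list(
        difflib.unified_diff(snapshots[old], snapshots[new], fromfile=old, tofile=new, lineterm="")
    )


print("\n".join(diff("v1", "v3")))
```

Comparing v1 to v3 costs the same as comparing v1 to v2; nothing has to be replayed in between. A system where the change announcements are the source of truth is the other way around: the snapshot is what you have to derive.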
In Germany for example, there are regular laws (if you need to know what's legal and what not, these are the ones you read) and change laws (Änderungsgesetze), which usually read like
"Law XYZ, Paragraph 5a, Sentence 3 is changed to read as follows: '...', Sentence 7 is appended with '...', Sentence 8 is removed"
So at any time, you've got a clear set of laws that apply.
That is basically a manual description of a git diff.
The conceptual differences, and the differences in content.
You are right that the conceptual differences are murkier in practice than in theory:
Even in civil law countries, judges tend to defer to precedent, even though they aren't strictly bound by it. And in common law countries, if you are a judge and you don't like a precedent (even a precedent set by a higher court), you can 'just' find enough differences between the case in front of you and the precedents you don't like, and argue away the differences with the precedents you do like.
Content-wise, there's a big difference in how eg German law sees contracts and how English law sees them. But you could (more or less) embed the English definition of contract in a civil law system, just as much as you could embed the German definition in a common law system.
In practice the main difference between English vs German contracts is in the legal boilerplate and fine print. You can think of the boilerplate text like a 'polyfill' or compatibility layer that gets you closer to what you actually want to implement in the main text of your contract.
Pure common law would be the complete absence of a separate legislature. Everything would be just layers of judicial decisions.
Pure civil law systems would have judges always ruling exclusively on the case in front of them based exclusively on the text of the laws. But in practice, especially at the higher levels of the courts and especially in matters of constitutionality or higher laws, you have a certain deference to precedent.
The advantage of this system is mostly that you very quickly end up with a giant body of legal text and any edgecases to your laws get hammered out very quickly and consistently. It allows for a more populist lawmaking system, since all the fine details are sorted out by the courts. It frees up politicians from considering how their laws will actually be enforced, since after a law is passed, the only effect it can have is on their political history. It also works well if you have a colonial empire where laws could take several months to spread across the empire, but you needed decisions to be taken in the here and now - common law moved some of those complexities to the local judges, where unless they'd be curtailed by "up high", they could be more practical enforcers of the law.
The disadvantage of this system is that you get a very large body of legal text fairly quickly. Actually interpreting common law tends to net you a quagmire of legal precedent that can often be more complex than the law itself and because there's a lot of it, you end up with the situation where a case can have multiple kinds of contradicting case law applied to it (in which the most recent version would "win", or the version decided on by the supreme court). It can turn legal arguments less into "interpret the law" and more into "throw enough citations at your opponent to force the judge's hand". A related problem is the risk of setting bad precedent: it's possible to craft "the perfect case" for certain legislation, leading to undesirable precedent being created, even if the law is correctly interpreted for that situation. This happens very often in the US, and is why Americans in my experience have no problem defending actively horrible cases for their one-issue cause, because they're more terrified of the precedent it'd set for the future. (The Internet Archive lawsuit is one such situation, where most Americans were more terrified about the precedent it set for copyright rather than a fair judgement on what IA actually did.)
Civil law on the other hand has no innate concept of stare decisis. Laws are as a rule only interpreted as they are passed. This doesn't mean case law doesn't exist, but it's not capable of forcing a judge's hand if the judge can find an argument to not follow it and the decision of one judge can't have ramifications down the road - you need multiple judges across multiple cases to find similar conclusions before it becomes case law. (This is called jurisprudence constante.)
The main advantage is that the total amount of legal text you need to understand the law is much smaller: if you get a law book, it'll just outright contain the majority of things you need to know about the law. There's still the usual "legal definitions don't always follow the common definition" going on, but that's not nearly to the same degree as happens with common law. It also forces the hands of politicians to consider much more about how the law is going to be applied in the future; a badly written law can have a lot of unwanted side effects, making it an easier target to strike from the books (which also tends to be more important in civil law; common law tends to leave bad laws on the books if case law happens to overturn it, which can lead to problems if the case law overturning the law is overturned). It gives "administrator" style politicians more leeway, since they need to actually interact with the bureaucracies that are going to enforce the law to make it work. There's also no "perfect case" situation of creating undesirable jurisprudence on accident. Judges don't have to consider the broader ramifications their decisions can have in those cases either, since they aren't writing new laws by deciding on those cases.
The disadvantage is that what you gain in comprehensibility, you lose in consistency. A judge can just outright decide to interpret the law in a different way from what's written and the recourse you have is pretty much to appeal it, demanding a retrial. This gives greater flexibility on "edge cases" where the law isn't super clear (without a worry for the greater ramifications that deciding on an edge case can cause), but can also lead to what should be an open and shut case getting a weird judgement (although the impact is fairly small ultimately - this is for example why the Hamburg Regional Court can keep farting out the most braindead copyright decisions that go directly against other laws and legal interpretations, but there's basically no downstream effect). Lawmakers try to prevent this by including "recitals", which are basically written addenda that try to clarify what each section of legal text is trying to address. This leads to the weird situation where, although the effective total body of law is smaller, the legal texts themselves tend to be much larger once you take recitals into account.
Personally, I prefer civil law over common law because of the lessened impact a bad decision can have.
That's why I say there's no strong distinction. They're just like slightly different flavors of the same basic principles.
What similar resources are out there? Any favorites?
https://arstechnica.com/tech-policy/2018/11/how-i-changed-th...
They don't appear to publish the XML formats they're using.
There's no description of their fees/pricing.
None of the software is open-source. There's barely any details about how the software even works.
It's about as opaque as they seem to feel they can get away with. "Email us for pricing" is not how a service like that should work.
No, I stand by my original phrasing: The important part here is that legislative changes are being recorded and shown through a version control system (i.e. git) not the fact that one of the repos is publicly visible via one of several possible web-GUI services.
It would still be somewhat laudable even if the usage was: "Here's the repo, clone it yourself and poke around with desktop tools." In contrast, the opposite mix of "here's a website that shows diffs but you can't have the underlying data" would suck.
> They don't appear to publish the XML formats they're using.
I see XSD files...? Plus, this original design doc is probably still relevant. [0]
[0] https://github.com/DCCouncil/dc-code-prototype?tab=readme-ov...
What's important in the story is that the law went from being not open to open and the law-publication-process was modernized internally. The fact that it ended up on GitHub was the least important, but most fun, outcome.
GitHub adds nothing of any value for the transparency and accountability of the lawmaking process (I mean, what lawmakers do), but it is a great platform for publishing structured data files for the law to create open access.
Aka a heatmap.
What happens when these get solved in a staccato timeframe is shown by the current US government, which at this point even fails to fulfill basic needs of supermarkets. Proper laws/initiatives aren't created/fleshed out by actual politicians but by a whole army of employees of the related institutions.
Everything else is pure populism.
git blame
I wonder how easy it is to adapt this project to that?
https://xcential.com/blog/version-control-for-law-tracking-c...
And everybody should be able to submit improvements to the law.