stouset 20 hours ago | next |

SAML is absolutely insane. It’s three separate specs: one that defines what every XML element means semantically, one that defines multiple document models that you might want to combine those elements into, and one that talks about network protocols you might want to use those documents in.

It’s insane and inscrutable.

I previously worked at the company that first created this gem. It was not written based off actually reading the spec. It was based off a loose examination of what other legitimate docs in the wild looked like, and built to parse those.

Which of course meant that early on it was vulnerable to everything since it was built to fit positive results and not negative ones. This isn’t even the first XML signature issue: early released versions didn’t even bother to check that the part being used was the part that was signed. If any part of the doc was signed and valid it was good to go.

Fun times.

userbinator 17 hours ago | root | parent | next |

In my experience, anything XML-related seems to be the product of simplicity-hating architecture astronauts with zero consideration for efficiency, possibly as a way of justifying their existence and continued employment.

Standards based on ASN.1 get a lot of hate (X.509 etc.) but I'd rather work with that than XML.

darby_nine 5 hours ago | root | parent | next |

> In my experience, anything XML-related seems to be the product of simplicity-hating architecture astronauts with zero consideration for efficiency, possibly as a way of justifying their existence and continued employment.

I am very confused by people who have emotional reactions to technologies. XML has a number of capabilities that are very difficult to represent in other documents without creating an unreadable mess. XML is more than just the worst SOAP api you've used.

lcnPylGDnU4H9OF 9 minutes ago | root | parent | next |

> I am very confused by people who have emotional reactions to technologies.

Given the phrasing ("simplicity-hating architecture astronauts"), I suspect they're having an emotional reaction to a colleague who, at some point, advocated for their team's use of XML. Which I honestly think is slightly justified if their colleague's advocacy was thoughtless or otherwise unreasonable. Maybe JSON would actually work better for what they were doing. Of course, that said, I also say it's "slightly" justified because good ol' XML did nothing wrong.

AtlasBarfed 3 hours ago | root | parent | prev |

......such as?

XML is a hierarchical data structure that has less data hinting for parsers/serializers than JSON.

Please don't say namespaces. They broke XPath and other parsing tech. Attrs vs tags? CDATA? Anyone remember IBM web services being all CDATA tags?

zerkten 6 hours ago | root | parent | prev | next |

This is very true. You only have to look at the list of authors/editors of the specs from OASIS to see that they square up with the individuals identified by Joel in https://www.joelonsoftware.com/2001/04/21/dont-let-architect....

If XML was removed, these specs wouldn't have been much better. The motivation of the companies sponsoring specs was to build something that sold more enterprise middleware and identity servers. This was never going to be attractive to the individuals working with IETF, or the people working on web standards who'd create WHATWG (I know SAML isn't their domain.)

Many readers here also won't be aware of how web services, SOAP, and XML were the AI hype of the time. These were getting pushed into every kind of solution. At the same time, the alternative options for document and serialization formats weren't ubiquitous, so having XML everywhere was somewhat pragmatic for the average developer. I can't imagine ASP (or early-.NET) developers trying to deal with ASN.1.

amluto 17 hours ago | root | parent | prev | next |

I find XML to be perfectly fine as a markup language. I once set up a system to generate logs as XML elements and display them prettily with XSLT. It was delightful!

Using XML as an interchange format for things intended to be read by a machine is not so great. Don’t use it where you actually want something more like protobuf.

bawolff 15 hours ago | root | parent | next |

XML isn't great for that, but the XML part is the least of the concerns. XSignature and SAML are insane specs, and would still be insane if the underlying presentation language wasn't XML (I mean, then you wouldn't be using XSignature, but if you made some XSignature-like thing for a different presentation language, it would still be crazy).

zaik 11 hours ago | root | parent | prev | next |

XMPP is still a simple and efficient chat protocol. But it makes sure to only allow a sane subset of XML.

gwervc 10 hours ago | root | parent | prev | next |

You are throwing the baby out with the bathwater. XML is fine, even if it has some gotchas. Astronauts will go wild with any tech at their disposal (I'm seeing it every day at work with .NET and JSON).

tannhaeuser 13 hours ago | root | parent | prev |

You're not wrong. The amount of fields of use where XML is used bogusly which have nothing to do with markup is truly staggering. XML signing and relying on canonical XML serialization for it is just peak and something else.

The sad thing is that XML was meant as a simplification over full SGML for delivery of markup on the web. Specifically, XML is always fully tagged (doesn't make use of tag inference), and has neither empty ("void") elements nor short forms for attributes such as in <option selected>. Thus XML never needs markup declarations for special per-element or per-attribute parsing rules. This was done to facilitate newer vocabularies next to HTML like SVG and MathML.

But soon enough, folks took the XML specification as an invitation for complexity and a self-serving spec circus: namespaces, XInclude (as a bogus replacement for entity expansion), XQuery, XSLT, XML Schema as a super-verbose replacement for DTDs using XML itself, etc. XHTML 2 was the largest failure and turning point, introducing not just a new vocabulary, but trying to reinvent how browsers work in a design-by-committee fashion. It could be said that XHTML took W3C down along with it.

For message payloads in large and long-term multi-party projects (governments, finance/payments, healthcare, etc.), I'm however not sure the alternatives (JSON-over-HTTP and the idiotic quasi-religious appeal to misunderstood "REST" semantics) are really helping. XML Schema, while in part overkill and unused (substitution groups), certainly has facilitated separating service interface from service implementation, multiple generations and multiple implementations, test case bases, and other long-term maintenance goals.

eftpotrm 10 hours ago | root | parent | next |

I'm not going to argue that XML hasn't been used badly and excessively in a lot of places, it really has, and using every part of it religiously will tie you in knots, fast.

But I can't help noticing that Json is gaining more and more XML-like functionality through things like schemas and JsonPath, as people slowly realise why XML had those functions they're now having to replace. I'm a long way from convinced that all the engineering effort to switch was actually beneficial.

dwaite 3 hours ago | root | parent | next |

And schema and paths have much the same issues - they are being used as tools in things like network-exchanged messages when the underlying specs and the implementations out there were not designed with that idea in mind.

You are going to have a bad time if your schema validation tries to resolve schema URL by default.

You are going to have a bad time if your JSONpath implementation supports the older "eval" mechanisms, or has unbounded memory/processing time growth from top-down traversal of the JSON.

The issue in the article was purposely avoided in JSON by virtue of JWS not having canonicalization, transforms, or partial signatures. You sign a chunk of binary data, and that binary data might be parsable as JSON.
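To make the "sign a chunk of binary data" point concrete, here is a minimal sketch of a JWS-style compact token in Python. It assumes a shared-secret HS256 setup and hypothetical helper names (`sign`, `verify`); it is not a complete JWS implementation and deliberately ignores the header's `alg` on verification. The payload is only parsed as JSON after the bytes have been verified, so there is no canonicalization step at all.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWS compact serialization does."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode unpadded base64url by restoring the padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: bytes, key: bytes) -> str:
    """Sign opaque payload bytes; the signer never inspects their structure."""
    header = b64url(json.dumps({"alg": "HS256"}).encode("utf-8"))
    body = b64url(payload)
    signing_input = f"{header}.{body}".encode("ascii")
    tag = hmac.new(key, signing_input, hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(tag)}"


def verify(token: str, key: bytes) -> bytes:
    """Return the exact bytes that were signed, or raise ValueError.

    Assumption: the algorithm is fixed to HS256 by the verifier, so the
    header's "alg" value is never trusted.
    """
    header, body, tag = token.split(".")
    signing_input = f"{header}.{body}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(tag)):
        raise ValueError("bad signature")
    return b64url_decode(body)


if __name__ == "__main__":
    token = sign(b'{"sub": "alice"}', b"demo-key")
    claims = json.loads(verify(token, b"demo-key"))  # parse only after verifying
    print(claims["sub"])
```

Because the signature covers the encoded bytes as transmitted, there is no question of which part of a parsed tree was signed.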

darby_nine 5 hours ago | root | parent | prev | next |

> But I can't help noticing that Json is gaining more and more XML-like functionality through things like schemas and JsonPath, as people slowly realise why XML had those functions they're now having to replace.

I think there's an analogy here to static typing and gradual typing. XML is a massive pain in the ass to implement and JSON is often good enough. Only having to implement the features you plan on using is quite nice.

eftpotrm 3 hours ago | root | parent |

For who though?

If you're a user of whatever-data-format designing your new application, you could always use the subset you actually cared about. No-one forced you to use all the complex bits in XML.

If you're a library author - well, yes, you could implement a Json parser at first that was eval(input), then something more complex because that's a security hole, then something else again because that's not too quick, then a new library like JsonPath to get queryability, and... all your work is still less functional than the system you were trying to replace. So yes, you can possibly implement Json libraries in less code than implementing XML libraries. But unless you had a reason to implement a new XML library from scratch anyway, that isn't actually a win.

darby_nine 2 hours ago | root | parent |

> No-one forced you to use all the complex bits in XML.

Just parsing XML alone is hugely painful, let alone implementing the rest of the stuff like XSLT, namespaces, validation, xpath, etc etc. Plus, once you've done this, you still need a natural way to map this into domain types, or you need to force people into a visitor pattern or some other awkward deserialization technique. JSON just requires a single JSONValue sum type.

XML has its place, but it'd have to be a pretty extreme case of needing rigor or a tree where I need to be able to peg arbitrary attributes onto the nodes in order to see it as attractive. Most APIs won't benefit from all of XML's features.

For instance I maintain a podcasting/rss feed library and XML (And more importantly, the way people publish invalid xml) makes me really wish they had gone with a different format in the day that was harder to fuck up.

darby_nine 5 hours ago | root | parent | prev |

> have nothing to do with markup

Yes, it's a badly named language. It has nothing to do with markup. As always, intentions don't matter at all, and it's the best tool we have for certain types of structures.

dwaite 3 hours ago | root | parent |

It is a properly named language. Just a staggering majority of XML use is for a purpose it was not originally designed for. It is meant to be a good tool for progressive enhancement text markup, such as stylizing hypertext or math equations.

darby_nine 16 minutes ago | root | parent |

> It is meant to be a good tool for progressive enhancement text markup, such as stylizing hypertext or math equations.

Well, at least hypertext. I've seen the disaster that is MathML.

bradly 17 hours ago | root | parent | prev | next |

I was jamming this gem into Rails back in 2009-2010 and will tell you we had no idea what we were doing on our side either. We were a couple of Rails devs at a tiny startup implementing Qualcomm's SSO and tbh I'm surprised it actually worked.

There wasn't a two-legged OAuth gem at the time, so I remember writing one and being blown away at how much I actually understood the OAuth 1.0 2-legged spec.

ucarion 20 hours ago | root | parent | prev | next |

You'll be pleased to know that we're not making a ton of progress on the "split things over N docs" front.

In recent years IETF has given us SCIM (which is sort of like "offline SAML") which is 3 RFCs (goals, schemas, http stuff), and of course JWT is actually part of a series of like 9 RFCs (including JWT, of course, but also JWK, JWS, JWE, JWA, ...).

I think there's this phenomenon where people who are like "dude, nobody cares, just do the dumbest possible thing we can get away with" aren't the people who decide to get involved in writing security specs.

victor106 19 hours ago | root | parent |

> SCIM (which is sort of like "offline SAML")

If you are talking about SCIM (System for Cross-domain Identity Management) then it’s very different from what SAML is. SCIM is used for user provisioning, whereas SAML is used for SSO.

bfrog 19 hours ago | prev | next |

Signed xml alone is a wildly confusing idea, as the signatures get embedded as elements in the document being signed. There’s a wild set of rules on how to make xml canonical, sign, add the signature, etc. It’s nontrivial.

vbezhenar 18 hours ago | root | parent |

What's confusing about it? Everything seems pretty obvious to me.

bfrog 7 hours ago | root | parent | next |

To clarify, in signing you have to convert xml to bytes you can get back in the other side, while modifying the same bytes injecting signatures. The whole custom canonical xml serializer is actually complex with escape rules and a bunch of other insanity. On the other side you have to do the opposite by dropping the signature element and serializing the same way.

Worse this is done at an element level not a document level as noted in the linked article.

Really, it’s not that simple. It typically requires a whole XML library for dealing with it, and that is error-prone. Check the number of errors and CVEs for libxmlsec, for example. Or even the versions in C# or Java.
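A small illustration of why canonicalization exists at all, using Python's standard library. One caveat up front: `xml.etree.ElementTree.canonicalize` implements C14N 2.0, which is not necessarily the canonicalization algorithm a given SAML deployment uses, so treat this purely as a sketch of the idea. The two sample documents are made up.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two serializations of the same logical document: attribute order,
# whitespace inside the start tag, and empty-element syntax all differ.
a = '<Assertion ID="x1" Version="2.0"><Conditions></Conditions><Subject>alice</Subject></Assertion>'
b = '<Assertion  Version="2.0"   ID="x1"><Conditions/><Subject>alice</Subject></Assertion>'

# Hashing the raw bytes treats them as different documents.
raw_equal = hashlib.sha256(a.encode()).digest() == hashlib.sha256(b.encode()).digest()

# Canonicalizing first maps both to one byte sequence that can be hashed and signed.
canon_a = ET.canonicalize(a)
canon_b = ET.canonicalize(b)

print("raw bytes equal:      ", raw_equal)
print("canonical forms equal:", canon_a == canon_b)
```

Every rule the canonicalizer applies (attribute ordering, namespace handling, escaping, comments) is a place where signer and verifier have to agree exactly, which is where much of the complexity comes from.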

nimish 7 hours ago | root | parent | prev | next |

Xml canonicalization is insane but necessary. Far more complex than the signature process itself

Then the incredibly stupid need to modify the signed document to insert the signature inline, so verifying it requires a full-blown parser among other things

captn3m0 18 hours ago | root | parent | prev |

Adding Signatures to an existing document, no matter the format is just a whole bunch of trouble.

maxbond 17 hours ago | root | parent |

To expand, generally you wouldn't want to change the identity of the document by signing it (e.g. change its hash). That's bananas. If the signature was external to the document, you wouldn't need any complex and error-prone rules to canonicalize. You'd just generate an HMAC tag and send it alongside (or, better yet, use an authenticated encryption like AES-GCM).

jahewson 15 hours ago | root | parent |

The sane thing is to sign bytes, as you suggest. But OP is right that it needs to preclude adding signatures to a document.

agentultra 6 hours ago | prev | next |

Businesses that want to integrate with larger SaaS providers and enterprises are often compelled to implement SAML. I used to fight tooth and nail to avoid it over issues with the SAML spec but... business is business.

Good suggestions from the article: work around it. The non-technical folks may force you to implement it in your system. Doesn't mean you have to leave your systems vulnerable.

benmmurphy 10 hours ago | prev | next |

i think the problem is signature verification APIs should return the signed data or an error and then the consumer should use the signed data from the API and not any other data. then there is no confusion over what was signed or not.

in the case of XML signature verification they probably should return a list of (XMLElement, Path) tuples. so the actual XMLElement that was signed and verified by the API and a path to the element in the document. having APIs that return IDs and then make assumptions that the signature verification code and the consumer code is going to perform resolution the same way is dangerous. even returning the path is a potential footgun but I assume consumers of an XMLSignature need to be able to check that elements appear in certain places in the document. i guess also DOM model APIs are probably implicitly returning a path if they support navigating by `getParentElement()`.
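A rough sketch of that API shape in Python. Everything here is hypothetical: `tag_for` is a toy HMAC stand-in for a real XML signature, `KEY` is a made-up shared secret, and the sample documents contain no whitespace between elements so serialization stays simple. The point is only that the verifier hands back the elements it actually verified, with their paths, and the consumer reads nothing else.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

KEY = b"demo-key"  # hypothetical shared secret, for illustration only


def tag_for(element):
    """Toy stand-in for a signature: HMAC over the element's canonical form."""
    canonical = ET.canonicalize(ET.tostring(element, encoding="unicode"))
    return hmac.new(KEY, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def walk(element, path=""):
    """Yield (element, path) for the element and all of its descendants."""
    here = f"{path}/{element.tag}"
    yield element, here
    for child in element:
        yield from walk(child, here)


def verified_elements(root, ref_id, expected_tag):
    """Return [(element, path)] for elements that carry ref_id AND verify."""
    verified = [
        (el, path)
        for el, path in walk(root)
        if el.get("ID") == ref_id
        and hmac.compare_digest(tag_for(el), expected_tag)
    ]
    if not verified:
        raise ValueError("nothing in the document matches the signature")
    return verified


if __name__ == "__main__":
    signed = ET.fromstring('<Assertion ID="a1"><NameID>alice</NameID></Assertion>')
    tag = tag_for(signed)

    # A wrapping attempt: a forged assertion up front, the signed one tucked away.
    forged = ET.fromstring(
        '<Response>'
        '<Assertion ID="a1"><NameID>admin</NameID></Assertion>'
        '<Extensions><Assertion ID="a1"><NameID>alice</NameID></Assertion></Extensions>'
        '</Response>'
    )
    for element, path in verified_elements(forged, "a1", tag):
        print(path, element.findtext("NameID"))
```

In the wrapping attempt, only the element under `Extensions` comes back, so the consumer can see both what was signed and that it sits in an unexpected place.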

quickgist 15 hours ago | prev | next |

I love this quote from the blog:

> Why are we making chandeliers out of swords of Damocles?

Amazing description of proliferating footguns.

lifeisstillgood 12 hours ago | prev | next |

This article actually reads like a dev who understands the problem and has an opinion.

Where else can one find such writing about security issues ?

caust1c 21 hours ago | prev | next |

I know very little about XML and SAML, but from what little I do know it shocks me that it's still the de-facto standard for SSO.

Great analysis and thanks for sharing!

tptacek 21 hours ago | root | parent |

It should not be, and people should use OIDC in preference to it wherever they can.

Roguelazer 20 hours ago | root | parent |

I'm optimistic SAML will be dead soon. ActiveDirectory/EntraID/whatever Microsoft wants to call it now supports OpenID Connect. Okta, OneLogin, Google, and all the other post-turn-of-the-millennium IdPs support OIDC. Shibboleth is the last major IdP I know of that is SAML-only, and I haven't seen anyone using it in like 10 years. When I built enterprise SSO for my current company, we went OIDC-only and we haven't had a single customer who needed SAML.

jrochkind1 18 hours ago | root | parent | next |

> Shibboleth is the last major IdP I know of that is SAML-only, and I haven't seen anyone using it in like 10 years

Most universities are still using Shibboleth. And probably will be forever. I think Shibboleth influenced SAML, probably to its detriment.

Griever 15 hours ago | root | parent |

Yup, thankfully most federate through InCommon so it’s less painful than it used to be, but that’s not saying much.

Johnnynator 13 hours ago | root | parent | prev | next |

> Shibboleth is the last major IdP I know of that is SAML-only

Shibboleth has officially supported plugins for OIDC for some time now.

As others said, Shibboleth is still rather popular at universities and in higher education. OIDC will have a hard time gaining a foothold there until the OpenID Connect Federation draft is finished and then implemented by the different metadata federations that exist (most National Research Networks manage one).

zdragnar 19 hours ago | root | parent | prev | next |

Working in the health market, pretty much the only thing our customers support is SAML, and that's only among customers who have anything at all that can integrate with us.

hirsin 18 hours ago | root | parent | prev | next |

Okta barely supports OIDC I'm afraid. We have to use SAML with them because they don't support a reusable app model for OIDC (a "marketplace app" that multiple customers can use).

I'd love to add FastFed support for OIDC and be done with it but SAML still rules the world.

layer8 5 hours ago | prev | next |

> All an XML signature does is let you cryptographically sign an XML document. Same thing as what JWTs do with alg: "RS256" (no ES256, because remember: the year is 2000).

This stopped being true in 2005, see RFC 4050.

agentultra 6 hours ago | prev | next |

> Ignore Postel

Pretty good advice. I believe it should be the default. The situations that require permissiveness should be exceptions and treated with a high degree of scrutiny.

bawolff 20 hours ago | prev | next |

SAML has to be one of the worst security specs ever

hsbauauvhabzb 20 hours ago | root | parent |

Why do you say that? I think it’s ugly, but it’s substantially simpler to understand than oidc. What parts of the spec (read: not just shitty implementations by developers) are bad?

I’m genuinely curious here, I’m not attempting to bait an argument.

bawolff 17 hours ago | root | parent | next |

Two things

SAML itself is sort of a kitchen sink. It includes everything you could possibly ever want, but nobody implements all of it, so you need to figure out a common subset, which defeats the point of a standard.

Second, XMLSignature sucks... like badly. Only part of the response is signed, but there is no standard for which part. It is way too complicated. Having multiple overlapping signatures is crazy. Comments aren't signed but change meaning of document. A billion signature types. Etc.

Freak_NL 13 hours ago | root | parent | next |

XML signatures in SAML suck so much they deserve to be your point one. For functionality at least it's possible to just poke around and see what works with whatever party you're connecting to, but debugging broken signing? With XML signatures it is possible to have it all working with one provider (perhaps a Windows machine running ADFS) and then be unable to verify the signatures from another, and you'll never know where the fault lies.

At least with modern stuff like JWTs the ways to encrypt and sign are well-understood.

mdaniel 4 hours ago | root | parent | prev | next |

> Comments aren't signed but change meaning of document

Do you have an example of that assertion handy? The only comment-influences-execution behavior I'm aware of is in SQL[1], and I haven't ever seen any XML system (in any business domain) which does what you said

1: I mean, setting aside linter suppression, which pedantically does impact execution but I meant of the final software

bawolff an hour ago | root | parent |

https://duo.com/blog/duo-finds-saml-vulnerabilities-affectin... has the full details.

But basically in some xml apis, a comment can split a single text node into two adjacent text nodes. Some implementations would only look at the first text node. The original xsignature spec (although i think this has been changed) said to remove all comments from doc before signing it, so the attacker can add arbitrary comments without messing up the signature.
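The text-node split is easy to reproduce with Python's `xml.dom.minidom`. This is only an illustration of the DOM behavior described above, not code from any of the affected libraries, and the identifier is a made-up value on reserved example domains.

```python
from xml.dom import minidom

# An attacker who controls "admin@example.com.evil.test" at the IdP inserts a
# comment. If comments are stripped before signing, the signature is unaffected.
doc = minidom.parseString("<NameID>admin@example.com<!-- x -->.evil.test</NameID>")
node = doc.documentElement

# The comment splits the content into: text node, comment node, text node.
child_count = len(node.childNodes)

# Buggy consumer: reads only the first text node.
first_text_only = node.firstChild.data

# Safer consumer: concatenates every text node under the element.
full_text = "".join(
    child.data for child in node.childNodes if child.nodeType == child.TEXT_NODE
)

print(child_count)
print(first_text_only)
print(full_text)
```

The buggy read yields a different identity than the one the IdP actually asserted, which is the whole attack.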

silon42 11 hours ago | root | parent | prev |

Personally I've found one of the few sane uses of XMLSignature is just to use only enveloped-signature, where the signature is then removed from message before processing... also it can be composed by nesting them (carefully).

tptacek 19 hours ago | root | parent | prev | next |

It is drastically harder to understand than OIDC, in large part due to XML signatures, which are a demented format, mostly for the reasons stated in this blog post (but also for some reasons it shares with JWT, and also for some sui generis reasons having to do with how complicated xmldsig is and how few implementations there are of it). You really couldn't find a worse format to do cryptography with than XML.

unscaled 17 hours ago | root | parent | prev | next |

It's not just shitty implementations here. The designers of SAML and XMLDSig cannot just blame the developers for implementing their "perfect" spec incorrectly.

The blog post above details exactly why XMLDSig can only be implemented securely, if you explicitly make an effort to ignore the spec. When following the specification leads to insecure implementations it's the spec that's shitty, and the spec authors should carry the blame.

The Open ID spec isn't great either and has its own share of issues, but in most scenarios, it doesn't rely on signature validation. If you only use the authorization code flow, breaking the ID token signature becomes ineffective, since the attacker still needs a valid authorization code from the IdP for this attack. If you restrict your implementation to what is allowed in OAuth 2.1 [1] or follow OAuth Best Practices [2], you can implement Open ID Connect pretty safely, as they eliminate the implicit grant and introduce PKCE.

I sure wish the OpenID Foundation would cut all the unnecessary bloat in their spec (namely the ID Token, Implicit and Hybrid Flow and unnecessary client-side token validation rules) and leave it as just a simple extension to OAuth 2.1 that specifies a few extra parameters and a User Info Endpoint. But if we have to live with this over-engineered spec, I can still trust that implementations of OIDC would fail less horribly than SAML.

[1] https://oauth.net/2.1/

[2] https://oauth.net/2/oauth-best-practice/

bawolff 15 hours ago | root | parent |

> It's not just shitty implementations here

I agree 100% the spec is shitty, but on top of it, some of the implementations are really weird beyond the spec. There was a prominent C library for it that (last I checked) in the default config added a custom HMAC signature version where the HMAC key is embedded in the attacker-controlled document, and also hooked into the system web PKI, so if the provided key doesn't match, it will test if the doc was signed by a TLS key from any website in the world.

jiggawatts 18 hours ago | root | parent | prev |

It’s also a nested meta protocol with an extensible markup language (XML) used to express extensible fields but using SAML encoding for them instead of just XML. It’s the inner-platform effect, which is common in over engineered monstrosities like SAML.

zb3 19 hours ago | prev | next |

Unfortunately XML signatures are also widely used in Polish government APIs which citizens/companies are required to use :(

vbezhenar 18 hours ago | root | parent | next |

Same here in Kazakhstan. And we also use our home-made crypto algorithms (derived from USSR GOST), which are not present in popular open source libraries.

notpushkin 10 hours ago | root | parent |

Are those different from Russian GOST algorithms? I think there’s a bunch of libraries (mostly forks of other popular open source libraries) for that.

magicalhippo 18 hours ago | root | parent | prev | next |

Same in Sweden and Denmark, several gov't systems requiring signed XMLs. And before you think legacy systems, no, these are the new systems, with rollout starting a few years ago and still ongoing.

dudeinjapan 18 hours ago | prev | next |

Recent RubySaml contributor here. The problem in this issue is not only RubySaml, but actually much older code in a module called XmlSecurity.

Some major problems with SAML are 1) the user’s browser acts as a MITM between the SP and IdP on all requests (vector for this attack), and 2) it requires the IdP and SP to maintain their own certs, which is fine in theory, but humans at big corps are lazy, and the complexity causes people to be lax on security.

SigmundA 17 hours ago | root | parent |

>1) the user’s browser acts as a MITM between the SP and IdP on all requests (vector for this attack)

This is exactly how OIDC implicit flow works. The basic difference is using JWT instead of signed XML otherwise it's nearly identical, I mean public/private key signing is the basis for JWT and XML sig.

SAML also supports artifact binding, which would use a back channel similar to other OIDC flows, but I haven't seen it used much because it makes things more complicated and requires the SP to be able to communicate with the IdP.

SigmundA 20 hours ago | prev |

Microsoft's SignedXml implementation in the .Net framework fixed this 8 years ago so long as you are correctly using the GetIdElement which makes sure there are no duplicates.

https://coding.abel.nu/2016/03/vulnerability-in-net-signedxm...
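The duplicate check itself is simple to express. Below is a sketch in Python of what "resolve the reference, but refuse duplicates" looks like; it is not the .NET implementation, and `get_id_element` here is just a hypothetical helper borrowing the name for readability.

```python
import xml.etree.ElementTree as ET


def get_id_element(root, ref_id):
    """Return the single element whose ID attribute equals ref_id.

    Raises ValueError when the ID is missing or appears more than once, so a
    signature reference can never be resolved ambiguously.
    """
    matches = [el for el in root.iter() if el.get("ID") == ref_id]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one element with ID={ref_id!r}, found {len(matches)}"
        )
    return matches[0]


if __name__ == "__main__":
    ok = ET.fromstring('<Response><Assertion ID="a1"/></Response>')
    print(get_id_element(ok, "a1").tag)

    dup = ET.fromstring(
        '<Response><Assertion ID="a1"/><Extensions><Assertion ID="a1"/></Extensions></Response>'
    )
    try:
        get_id_element(dup, "a1")
    except ValueError as err:
        print("rejected:", err)
```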

silon42 12 hours ago | root | parent | next |

And they "forgot" to tell people that if doing this properly, they need to use a schema/DTD where Id is defined as ID and then guaranteed unique by XML parser.

I've seen invalid schemas/signatures where Id was just defined as string in the schema (fails when verifying using libxml/xmlsec for example)

jrpelkonen 19 hours ago | root | parent | prev |

I know next to nothing about .NET, but this seems like the classic “you’re holding it wrong” excuse to me. If there’s a way to call an API the wrong way and the right way, and both appear to work, a large number of developers will implement the insecure one. Why can’t the incorrect API be removed? I understand there’s pressure to support old client code but vulnerabilities should trump backwards compatibility.

SigmundA 18 hours ago | root | parent |

The incorrect API would be using GetXml, looking at the raw XmlElement, and using select nodes or something, versus using GetIdElement on the SignedXml object itself. It's not going to prevent you from looking at the XML document directly and doing something incorrect, but it gives you a correct helper method right next to CheckSignature to do the right thing.

I mean, at some point you do have to understand the difference between XML and a specific schema of it and how it's used in SAML; it's not like XML elements are required to have a unique ID attribute.

This isn't something you would call directly anyway unless you were writing your own SAML client, which isn't that hard but there are existing ones, here is a simple one that works well:

https://github.com/jitbit/AspNetSaml