Lucas Gonze

Reply to posts using GitHub (https://github.com/lucasgonze/lucasgonze/discussions), email (lucas@gonze.com), or the guestbook (writing.gonze.com/guestbook).

What's Stopping Library Upgrades?

Big vulnerabilities in upstream dependencies can linger in deployed software long past the point when a patch is available. Maven estimates that 35% of Log4J downloads continue to pull the version with the world-famous vulnerability.

What's the cause? Why aren't developers applying patches?

National Ecosystems

If you follow the 35% link above you'll see that countries have characteristic exposure profiles. Taiwan is far and away the worst, and China next.

Taiwan and China share a language but not a government. Maybe the problem is security resources that fit common practices. Is Mandarin well supported by Dependabot and similar tooling? Are technology news sources for developers not covering Log4J? Is there cultural skepticism? Are different development platforms (e.g. Gitea instead of GitHub) popular there, and is there a difference in security resources?

The market share of unpatched Log4J in a given country is not the same as that country's contribution to the global total. Taiwan is tiny - even 80% unpatched downloads there would have less impact on the global numbers than 20% in a huge country like China.
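To make the arithmetic concrete, here is a small sketch. The download volumes and rates are invented for illustration (they are not Maven's figures), and the country labels are placeholders. It compares absolute unpatched downloads rather than percentages.

```python
# Hypothetical monthly Log4J download counts; these are NOT real Maven figures.
downloads = {
    "small_country": {"total": 100_000, "unpatched_rate": 0.80},
    "large_country": {"total": 5_000_000, "unpatched_rate": 0.20},
}

def unpatched_downloads(stats):
    """Absolute number of downloads that pulled a vulnerable version."""
    return stats["total"] * stats["unpatched_rate"]

def global_unpatched_share(all_stats):
    """Unpatched downloads as a fraction of all downloads worldwide."""
    vulnerable = sum(unpatched_downloads(s) for s in all_stats.values())
    total = sum(s["total"] for s in all_stats.values())
    return vulnerable / total

for name, stats in downloads.items():
    print(name, round(unpatched_downloads(stats)))
print("global share:", round(global_unpatched_share(downloads), 3))
```

With these made-up inputs the large country accounts for more than ten times as many vulnerable downloads as the small one, despite a much lower unpatched rate.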

The follow-up work here is a country-specific study of China and Taiwan. What's holding back patches may be obvious to developers in these places.

Maturity Levels

When a codebase is mature, there is more resistance to change.

From Beyond Metadata: Code-Centric and Usage-Based Analysis of Known Vulnerabilities in Open-Source Software:

In the early phases of development, updating a library to a more recent release is relatively unproblematic, because the necessary adaptations in the application code can be performed as part of the normal development activities. On the other hand, as soon as a project gets closer to the date of release to customers, and during the entire operational lifetime, all updates need to be carefully pondered, because they can impact the release schedule, require additional effort, cause system downtime, or introduce new defects.

How can patches specifically target mature codebases?

Mature software will have older code and will tend to use older library versions. The biggest issue is simply providing non-breaking patches for older library versions. The older a library the less its developers want to work on it, and the greater the chance that an upgrade will only be available with a major version upgrade.

What can the security community do? Encourage library developers to support old versions. Discourage breaking changes of any kind. Encourage application developers to give preference to libraries with a record of support for older versions.

Trustworthiness

There may exist a patch but it may not be well vetted. Every upgrade is a chance for something to go wrong. There may be new bugs and vulnerabilities.

Ways to ameliorate the problem:

  • Encourage and help with automated tests
  • Have a trusted third party certify updates
  • Discourage library providers who lack the resources to make trustworthy upgrades

Lack of Auto-Upgrade

The reach of automated vulnerability scanning and patching is, at a guess, still pretty low.

Vendored (copied and pasted) code is hard to scan or upgrade. Not all languages have high-quality scanning and upgrading. The CI/CD infrastructure for automated scanning and patching is relatively new. Package repositories like Maven lack facilities to force upgrades.

Follow-ups:

  1. Study improvements to vendored code detection and upgrade
  2. Identify needless gaps in tooling. For example, improve Dependabot availability in Gitea.
  3. Collaborate with package repositories on forced upgrades (or discouragement of known-bad versions)

Vulnerability Fatigue

Developers may be skeptical of vulnerability reports. There is a never-ending stream of announcements, but the daily impact is low.

Vulnerabilities may be in libraries that aren't used in production. Vulnerable open source dependencies: counting those that matter found that "about 20% of the dependencies affected by a known vulnerability are not deployed, and therefore, they do not represent a danger to the analyzed library because they cannot be exploited in practice." Identifying whether a dependency is one of these can take a considerable amount of work.

To ameliorate this problem, security tooling can improve detection of which risks are not a factor in production.

Another source of fatigue is inflation. Developers get cynical about new vulnerability reports when they have ignored old ones without suffering harm.

To ameliorate this problem, there could be checks and limits on new reports. A report and patch should be accompanied by a score from the Common Vulnerability Scoring System Version 3.1 Calculator.
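For a sense of what such a score involves, here is a sketch of the CVSS v3.1 base score calculation as I understand it from the published specification. It covers base metrics only (no temporal or environmental metrics), and the function names are my own. Treat it as an illustration, not a replacement for the official calculator.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# The Privileges Required weight depends on whether Scope is changed.
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}

def roundup(value):
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0

def base_score(vector):
    """Compute the base score for a vector such as 'CVSS:3.1/AV:N/AC:L/...'."""
    # Skip the leading "CVSS:3.1" component, then split each "NAME:VALUE" pair.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope = metrics["S"]
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    pr = PR_WEIGHTS[scope][metrics["PR"]]

    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * w["AV"] * w["AC"] * pr * w["UI"]

    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))

# The vector published for the Log4J vulnerability (CVE-2021-44228).
print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"))
```

A report that ships with a vector string like this lets the reader check the severity claim instead of taking it on faith.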

Structural Improvements to ERC-1155 Metadata

NFT metadata could easily be simpler to code, faster to load, less bug-prone, and easier to understand than with current specs. See example at the bottom. To comment, use GitHub Discussions.


ERC-1155 metadata has bits that are clumsy and inefficient.

  1. The decimals feature mixes presentation with data. ("The number of decimal places that the token amount should display - e.g. 18, means to divide the token amount by 1000000000000000000 to get its user representation"). This feature is a product design choice for front end developers. It's irrelevant to a contract and it relies on ultra-pricey on-chain real estate.

  2. The {id} interpolation feature violates the HATEOAS constraint of REST. The feature is defined as: "If the string {id} exists in any JSON value, it MUST be replaced with the actual token ID". Well designed APIs should use literal URIs generated by the servers that are generating the metadata JSON. (I bet the interpolation feature is not necessary.)

  3. The {id} interpolation feature reinvents the concept of server-side programming, like PHP. I doubt this was intentional. I think the data format was intended to be used within Solidity, and accidentally got incorporated into a new context.

  4. Localization support mixes non-localized data with localized data, has a weird extra field for a default locale, and requires a URI to be fetched for each locale. If locale-specific strings were inline, consuming these files would be faster and the code would be simpler.

  5. Locales are defined by reference to https://cldr.unicode.org/, but this is not a list of locales, it is an organization that shepherds lists of locales.

  6. The example JSON in the published EIP has an obvious bug:

    "image": "https:\/\/s3.amazonaws.com\/your-bucket\/images\/{id}.png",
    

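This is needlessly confusing. A ‘/’ character does not need to be escaped in JSON. The \/ escape is legal, and a JSON parser will decode it to a plain ‘/’, but anyone reading or copying the raw text sees strings that are not legal URIs. The example value should be: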

    "image": "https://s3.amazonaws.com/your-bucket/images/{id}.png",

  7. There is no way to map from one of these external JSON documents to the token that it is annotating, so there is no way to know if they can be deleted except by searching every NFT that ever existed.
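
To show what items 1 through 3 ask of every client, here is a rough sketch of the work needed before the metadata is usable. The function names are mine. The 64-character lowercase hex format is what the EIP specifies for the token URI; applying the same format to JSON values is my assumption.

```python
import json
from decimal import Decimal

def pad_token_id(token_id):
    """Lowercase hex, no 0x prefix, zero padded to 64 characters."""
    return format(token_id, "064x")

def interpolate_id(value, token_id):
    """Recursively replace '{id}' in every string of a parsed JSON document."""
    if isinstance(value, str):
        return value.replace("{id}", pad_token_id(token_id))
    if isinstance(value, list):
        return [interpolate_id(item, token_id) for item in value]
    if isinstance(value, dict):
        return {key: interpolate_id(item, token_id) for key, item in value.items()}
    return value  # numbers, booleans and null pass through unchanged

def display_amount(raw_amount, decimals):
    """Apply the presentation-only 'decimals' field to a raw token amount."""
    return Decimal(raw_amount) / (Decimal(10) ** decimals)

metadata = json.loads("""
{
  "name": "Asset Name",
  "decimals": 18,
  "image": "https://s3.amazonaws.com/your-bucket/images/{id}.png"
}
""")
resolved = interpolate_id(metadata, token_id=314592)
print(resolved["image"])
print(display_amount(1500000000000000000, resolved["decimals"]))
```

If the server emitted literal URIs and left number formatting to the front end, none of this code would need to exist.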

A future spec would:

  1. Nuke the decimals feature
  2. Nuke {id} interpolation
  3. Put localized data inline
  4. Separate localized and non-localized fields
  5. Eliminate the default locale - this belongs in the user-agent
  6. Fix the example data
  7. Require a locale name to be a subtag in http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

For the purposes of robustness, I'd also like to have:

  1. Ability to map from the metadata back to the token, so NFTs don't get hooked up to the wrong metadata. This requires a token ID in the JSON.

New and refactored example:

{
  "imageLink": "https://stableurl.com/my-ping",
  "locales": {
    "en": {
      "name": "Advertising Space",
      "description": "Each token represents a unique ad space in the city."
    },
    "es": {
      "name": "Espacio Publicitario",
      "description": "Cada token representa un espacio publicitario único en la ciudad."
    },
    "fr": {
      "name": "Espace Publicitaire",
      "description": "Chaque jeton représente un espace publicitaire unique dans la ville."
    }
  },
  "tokenID": "0x12f28e2106ce8fd8464885b80ea865e"
}
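
A sketch of how a client might consume a document shaped like this one. The helper names are mine, the inline sample is trimmed to the fields the code uses, and the matching is simplified: real language-tag matching is case-insensitive and has more fallback rules than the one shown here.

```python
import json

METADATA = json.loads("""
{
  "imageLink": "https://stableurl.com/my-ping",
  "locales": {
    "en": {"name": "Advertising Space"},
    "es": {"name": "Espacio Publicitario"},
    "fr": {"name": "Espace Publicitaire"}
  },
  "tokenID": "0x12f28e2106ce8fd8464885b80ea865e"
}
""")

def pick_locale(metadata, preferred):
    """Return (tag, strings) for the first preferred language that is available.

    `preferred` is the user-agent's ordered preference list, e.g. ["fr-CA", "en"].
    A regional tag falls back to its primary language subtag ("fr-CA" -> "fr").
    """
    locales = metadata["locales"]
    for tag in preferred:
        for candidate in (tag, tag.split("-")[0]):
            if candidate in locales:
                return candidate, locales[candidate]
    return None, None  # the caller decides what to show when nothing matches

def check_token(metadata, expected_token_id):
    """Reject metadata that was hooked up to the wrong token."""
    return int(metadata["tokenID"], 16) == expected_token_id

print(pick_locale(METADATA, ["fr-CA", "en"]))
```

Because every locale is inline, one fetch is enough, and the default is whatever the user-agent puts first in its preference list.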

For comparison, see eip-1155.md#localized-sample


Future topics:

  • I haven't figured out the properties object, which is a can of worms.
  • If the URI in an ERC1155 ERC1155Metadata_URI is IPFS, it is useless unless it is mutable, and it is only mutable if it is IPNS. Therefore, IPFS URIs with a non-IPNS path should be strongly discouraged.
  • Review for best practices
  • Much of the above is not related to metadata. It is about any sort of mutable data with a 1:1 relationship to an on-chain entity.

OSS Contribution Log: Blog on Logs merged into OWASP

I have had a PR merged into OWASP for the first time, a new Attacks On Logs section in the Logging Cheat Sheet. Given how much trouble it can be to get a PR merged into a new project, it's good to get a win.

This grew out of the Blog On Logs entry here.

HTML Should Support Markdown. Seriously.

Markdown has massive adoption. It clearly meets a need. The cost of HTML's power is verbosity, and sometimes that's the wrong tradeoff:

  • More typing. Writing HTML is slow.
  • More visual noise. Reading HTML is hard.

Browsers should support Markdown natively. The syntax should be part of HTML. There should be no need for a shim to translate.

This is very practical:

  • There is no Markdown syntax that can't be represented in HTML.
  • Markdown-to-HTML conversion is easy to implement.
  • Security risks are low.
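
As a rough illustration of the second point, here is a toy converter for a tiny subset of Markdown: ATX headings, "-" lists, paragraphs, and emphasis. It is a sketch, not a CommonMark implementation. It escapes all raw HTML, which real Markdown does not do, and it ignores nesting, links, and code.

```python
import html
import re

def inline(text):
    """Escape HTML, then convert **strong** and *emphasis* spans."""
    text = html.escape(text)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    return re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)

def markdown_to_html(source):
    """Convert blocks separated by blank lines into headings, lists, paragraphs."""
    out = []
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = block.split("\n")
        heading = re.match(r"(#{1,6}) +(.*)", lines[0])
        if heading and len(lines) == 1:
            level = len(heading.group(1))
            out.append(f"<h{level}>{inline(heading.group(2))}</h{level}>")
        elif all(line.startswith("- ") for line in lines):
            items = "".join(f"<li>{inline(line[2:])}</li>" for line in lines)
            out.append(f"<ul>{items}</ul>")
        else:
            out.append(f"<p>{inline(' '.join(lines))}</p>")
    return "\n".join(out)

print(markdown_to_html("# Title\n\nSome *text* & **more**\n\n- a\n- b"))
```

A browser implementation would be far more thorough, but the core translation is a small amount of code.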

The only non-trivial task is enabling CSS, which would require a canonical DOM representation of Markdown.

To the standards-mobile! This is a task for the HTML WG. I searched the mailing list and (surprisingly) didn't find discussion.

If you like this idea, all you have to do is discuss it on social media.

Blog on Logs

I am brainstorming security requirements for system logging. Can you think of others? Are some of these too lame to bother with? Do you know of specific attacks that might be relevant?

You can reply using an issue or email.

(Update Feb 21: This is documented by OWASP as Log Injection and by CWE as CWE-117. That documentation includes well-defined threat models).

Confidentiality

Who should be able to read what? A confidentiality attack enables an unauthorized party to access sensitive information stored in logs.

  1. Logs contain PII of users. Attackers gather PII, then either release it or use it as a stepping stone for further attacks on those users.
  2. Logs contain technical secrets such as passwords. Attackers use them as a stepping stone for deeper attacks.
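
One possible mitigation for both items is to scrub entries before they are written. This sketch uses a `logging.Filter` from Python's standard library. The patterns and the logger name are hypothetical and would need tuning for real data.

```python
import logging
import re

# Hypothetical patterns; a real deployment would tune these to its own data.
SECRET_PATTERNS = [
    (re.compile(r"(password|token|secret)=\S+", re.IGNORECASE), r"\1=[REDACTED]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
]

def redact(text):
    """Replace anything matching a known-sensitive pattern."""
    for pattern, replacement in SECRET_PATTERNS:
        text = pattern.sub(replacement, text)
    return text

class RedactingFilter(logging.Filter):
    """Rewrite each record's final message before any handler sees it."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = None  # the message is already fully formatted
        return True

logger = logging.getLogger("app")
logger.addFilter(RedactingFilter())
```

A filter attached to a logger only sees records created through that logger, so attaching it to each handler instead may give broader coverage. Pattern matching will also miss secrets it was not written for, so this complements a rule against logging them in the first place.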

Integrity

Which information should be modifiable by whom?

  1. An attacker with read access to a log uses it to exfiltrate secrets.
  2. An attack leverages logs to connect with exploitable facets of logging platforms, such as sending in a payload over syslog in order to cause an out-of-bounds write.

Availability

What downtime is acceptable?

  1. An attacker floods log files in order to exhaust disk space available for non-logging facets of system functioning. For example, the same disk used for log files might be used for SQL storage of application data.
  2. An attacker floods log files in order to exhaust disk space available for further logging.
  3. An attacker uses one log entry to destroy other log entries.
  4. An attacker leverages poor performance of logging code to reduce application performance.
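
A common defense against the first two items is to cap how much disk the logs can use. This sketch uses `RotatingFileHandler` from Python's standard library. The size limits and the logger name are hypothetical.

```python
import logging
from logging.handlers import RotatingFileHandler

MAX_BYTES = 10 * 1024 * 1024   # hypothetical cap per file
BACKUP_COUNT = 5               # hypothetical number of rotated files to keep

def build_bounded_logger(path, max_bytes=MAX_BYTES, backup_count=BACKUP_COUNT):
    """Return a logger whose files stay within a roughly fixed disk budget."""
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("bounded")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger

def worst_case_bytes(max_bytes=MAX_BYTES, backup_count=BACKUP_COUNT):
    """Approximate upper bound: the active file plus every kept backup."""
    return max_bytes * (backup_count + 1)
```

The cap is a trade-off. A flood can no longer fill the disk, but it can push older entries out of the rotation, which is a version of item 3. Shipping logs off the host before they rotate away is one way to limit that.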

Accountability

Who is responsible for harm?

  1. An attacker prevents log writes in order to cover their tracks.
  2. An attacker damages the log in order to cover their tracks.
  3. An attacker causes the wrong identity to be logged in order to conceal the responsible party.
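
Item 3 is often carried out by injecting line breaks, so that one request writes what looks like a second, forged entry. A minimal sketch of neutralizing untrusted values before they are logged follows. The function name is mine, and it assumes a one-entry-per-line text log.

```python
def neutralize(value):
    """Make untrusted text safe to embed in a single-line log entry."""
    out = []
    for ch in str(value):
        if ch == "\\":
            out.append("\\\\")  # escape backslashes so the output is unambiguous
        elif ch.isprintable():
            out.append(ch)
        else:
            # Newlines, carriage returns and other control characters
            # become visible escapes such as \n or \x1b.
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)

forged = "bob\n2022-02-21 INFO login ok user=admin"
print("login failed user=" + neutralize(forged))
```

The forged text still reaches the log, but it stays on the attacker's own line where a reader can see it for what it is.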

Liz Cheney's Heel-Face Turn

Cheney: I Do Not Recognize Those In My Party Who Have Abandoned The Constitution To Embrace Donald Trump

That seems familiar. Could it be the heel-face turn?

When a bad guy turns good. The term "Heel Face Turn" comes from Professional Wrestling, in which an evil wrestler (a "heel") sometimes has a change of heart and becomes good, thereby becoming a "babyface". Magazines and other promotional material from the various wrestling leagues comment on various wrestlers' changes in alignment nearly as frequently as they cover events in the ring themselves.

That depends on who you ask:

The nature of Heel-Face Turn and Face–Heel Turn is subjective (one person's "seeing the light" is another person's "heartless betrayal or fall" depending on what group the individual is going to or leaving).

What do the other members of the league of supervillains think?

Republicans rebuke Liz Cheney in unprecedented moves

Oops. Maybe she has her tropes mixed up and thinks she's in one that ends better for her character.

In movies with more than one supervillain, it's usually only the villain that acts as the Big Bad that perishes; the lesser ones either are captured, reform, or return as the Big Bad in the sequel. --(Superhero Movie Villains Die)

To get a badge

Two years ago I did a lot of work related to badging (Example) using the Open Badges 2.0 standard. At the time I had little intuition about the value. Yesterday I got a certificate for completing a Linux Foundation course. It was surprisingly satisfying, so much so that I added it to my LinkedIn profile.

When I took the course, it was partly for the learning and partly for the badge. I want to be able to position myself as a subject matter expert, and both the learning and the credential are useful. The desire to acquire the badge validated my earlier assumption that badges do lead to action.

This badge does not use Open Badges 2.0 as far as I can tell. That standard appears to be stone cold dead. There was no mention of the standard anywhere in the process or visible code. What makes the badge valuable instead are the signatories and the branding. Hero text: "The Linux Foundation"; then, with facsimile signatures, "Clyde Seepersad, SVP & General Manager, Training and Certification The Linux Foundation" and "Kay Williams, Chair of the Governing Board Open Source Security Foundation (OpenSSF)."

The badges I was issuing wouldn't have been as effective. The signatories would have been missing. The underlying evaluation would have been purely algorithmic. The branding would have been an unknown startup.

The badge I did receive is valuable enough that I paid for a course I could have taken for free, just because the certificate might be helpful for my career. Sharing a badge allows me to communicate that I have knowledge. Also, demonstrating completion of the coursework is relevant to The CII Best Practices badge, which has two tests that the certificate would influence:

  • The project MUST have at least one primary developer who knows how to design secure software.

  • At least one of the project's primary developers MUST know of common kinds of errors that lead to vulnerabilities in this kind of software, as well as at least one method to counter or mitigate each of them.

You get what you measure. I wanted to show success on the metric, so I went and got the knowledge.

Diversity, equity, and inclusion in open source, and Americanism

If you're working on diversity, equity and inclusion in open source and you're American, don't assume America. The world is big. Don't assume race as understood in the US is the central issue.

Every part of the world has its own hierarchy of privilege.

Diversity and Inclusion in Nigeria:

The issue of diversity has world wide relevance. As Chairman Mao Tse-Tung said: “Let a thousand flowers bloom”. However I believe, like most issues, diversity adopts different meaning and flavor, depending on the locality you situate it.

Although English is the official language, more than half of the population do not understand and or speak formal English. Pidgin English is often a means of reaching out to a significant portion of the population, but it has limited appeal in the Northern part of the country. ... There are two dominant religious groups in Nigeria, namely Moslems and Christians. Unless the workforce reflects the two religious groupings, it stands the risk of being identified as ‘belonging’ to one groups or the other. It also runs the risk of offending members of the religious groups, sometimes out of sheer ignorance.

Castes in India:

Indian Caste System

Discrimination in China:

Although 56 different ethnic groups are officially recognized in China, the nation remains fairly homogenous, with over 90% of its citizens belonging to the Han Chinese group. People from different ethnic backgrounds, as well as foreigners, consequently stand out and may sometimes face discrimination and racism in China.


All you can safely assume is that the other people in your project are smarter than you and will flip the bozo bit if you fail to see beyond your privilege.

OSS Work Log

CHAOSS: created a draft metric model for security facets of sustainability.

XSPF: went to add Tess Gadwa and Evan Boehs to the "about" page, made the changes, got ready to push, then realized I had already done this two months ago.

As a potential consumer of an open-source package, I must judge whether it is likely to introduce vulnerabilities or require updates in order to patch vulnerabilities.

Pwsafe: donated $20 to the maintainer, Rony Shapiro (GitHub).

This is the second donation I have made in roughly ten years of using his software. Then I went and stalked him on GitHub and Sourceforge. Just as I suspected, he's been patiently devoted for decades. I felt grateful and privileged.

Privacy Regulations for OSS Dev

Introduction

This document is intended as a jumping-off point for people who need specifics about privacy regulations that affect open source development, whether in law or contracts. I can’t and shouldn’t offer legal advice. However, developers need to be able to educate themselves. This is a directory of resources.

This document is intended to evolve and grow. I invite you to contribute information on any jurisdictions you are familiar with.

Government

UN

Article 17 of International Covenant on Civil and Political Rights, 1966

EU

GDPR

Germany

Bundesdatenschutzgesetz

United States

Federal

5 U.S. Code § 552a
(NIST) Guide to Protecting the Confidentiality of PII

California

OPPA 2003

CCPA 2018

Contract Law

GitHub

Privacy Statement

Acceptable Use Policy

Linux Foundation

Telemetry Data Policy

Discuss

Hacker News
Twitter
CHAOSS
GitHub

Nobody Tells a Volunteer What To Do: Town Meetings and Open Source

The organizational structure of open source teams resembles town meetings - small-scale government with minimal hierarchy.

[Image: Huntington town meeting. By Redjar, CC BY-SA 2.0]

Wikipedia on town meetings:

A town meeting is a form of direct democracy in which most or all of the members of a community come together to legislate policy and budgets for local government. It is a town- or city-level meeting in which decisions are made, in contrast with town hall meetings held by state and national politicians to answer questions from their constituents, which have no decision-making power.

Town meetings have been used in portions of the United States, principally in New England, since the 17th century.

The resemblance exists because open-source contributors are free in a way that employees aren't. They determine whether, when, and how to gift their time. If the project wants them - which is not always true - their voice will be valued from early on.

"Nobody puts baby in a corner":

No one with talent should be stopped from expressing it or showing it off. It’s about self-expression – about enabling anyone to be their best self, and a striking call against anyone who strives to keep people’s potential at bay.

The indigenous American critique of Europeans, per Graeber and Wengrow's "The Dawn of Everything", fits beautifully:

A “chief” had to persuade other members of the tribal council to get agreement on an action he thought was needed. He had to convince his fellows; he could not order them to obey. If he tried, tribal members were free to disobey.

Volunteer developers are free to disobey. Leaders must persuade rather than order.


At the same time, there are definite authorities, always.

Somebody owns admin privileges on the shared repo and site. Somebody knows the passwords. Somebody is recognized by outsiders as the voice and face. Somebody owns the domain.

Most importantly, somebody does a large enough proportion of work to dominate the shape of the project. And that goes to the most important force in open-source governance: the tyranny of volunteerism.

Nobody tells a volunteer what to do. A leader gives no orders. A leader helps a volunteer choose what to work on. A leader helps a volunteer learn how to be effective. A leader enables.


Robert's Rules of Order are often used to structure meetings:

Robert's Rules is the most widely used manual of parliamentary procedure in the United States. It governs the meetings of a diverse range of organizations—including church groups, county commissions, homeowners associations, nonprofit associations, professional societies, school boards, and trade unions—that have adopted it as their parliamentary authority.

They're valuable because they enable structure while accommodating widely varying social backgrounds:

A U.S. Army officer, Henry Martyn Robert (1837–1923) ... found San Francisco in the mid-to-late 19th century to be a chaotic place where meetings of any kind tended to be tumultuous, with little consistency of procedure and with people of many nationalities and traditions thrown together.

That "people of many nationalities and traditions thrown together" bit is the key. It fits open source perfectly.


People new to open source often find the pace of meetings maddening. It can be hard to know who, if anyone, is running them. Unlike in a corporate environment, there are long pauses and uncomfortable silences. But if you are comfortable with silence, the gaps in such meetings can do a lot of work.

Silence serves to draw in contributors. Whereas in a corporate meeting the highest-paid person does most of the talking, in an open-source meeting the leaders are those giving away the most labor. Knowledge and engagement are the coin. Silence serves to draw in comments from those who usually engage less. It sells them on contributing more. By voicing opinions they offer labor.

Silence also disrupts formal hierarchy. In a corporate meeting, the amount of talking each person gets to do corresponds with their salary.

Silence discourages overtalkers. It encourages listening. It makes time to contemplate what has already been said.


Quakerism - another historical Americanism - is relevant.

Quaker weddings are conducted similarly to regular Quaker meetings for worship, primarily in silence and without an officiant or a rigid program of events, and therefore differ greatly from traditional Western weddings.

The attendees gather for silent worship, often with the couple sitting in front of the meeting (this may depend on the layout of the particular Friends meeting house). Out of the silence, the couple will exchange what the Philadelphia Yearly Meeting describes as "promises", and Britain Yearly Meeting describes as "declarations" with each other. The promises are short, simple, and egalitarian, and can vary between different regions and meetings.


Practices rooted in open government are an awkward fit for commercial open-source, such as COSS, OSPOs and Inner Source. Should meetings with commercial projects temporarily adopt open government? Can they?

Maybe commercial open source can thrive without open government. Maybe not. The person who pays the bills will have an outsize influence that can destroy the group's output. For a business to truly benefit from open sourcing, it must create social dynamics like a town meeting.



Link Dump 1/17/2022

In the spirit of del.icio.us, here are some links I wanted to save without taking action and that might be useful to you as well as me.

https://woob.tech/: Web Outside of Browsers

a collection of applications able to interact with websites, without requiring the user to open them in a browser. It also provides well-defined APIs to talk to websites lacking one.

https://clearlydefined.io/: Helping FOSS projects be more successful through clearly defined project data.

ClearlyDefined and our parent organization, the Open Source Initiative, are on a mission to help FOSS projects thrive by being, well, clearly defined. Lack of clarity around licenses and security vulnerabilities reduces engagement — that means fewer users, fewer contributors and a smaller community.

https://threatrix.io/embedded-open-source: Our proprietary technology enables us to find embedded open source snippets in your software during build time.

This would be useful as part of an SBOM.

Community CRM Runbook: a tool that provides data-based insights into your community's health

It aggregates disparate data pulled from sources such as GitHub, Slack, Twitter providing a unified view of your members. Community managers and developer relations professionals log and manage all their interactions with the members capturing keeping information from their entire teams. This doc summarizes best practices identified by some community CRM users.

Info Gathering for Next-Gen NFTs

These insightful Moxie Marlinspike comments on quote-unquote web3 are helpful to gather product requirements for a next-generation vision of NFTs:

Instead of storing the data on-chain, NFTs instead contain a URL that points to the data. What surprised me about the standards was that there’s no hash commitment for the data located at the URL. Looking at many of the NFTs on popular marketplaces being sold for tens, hundreds, or millions of dollars, that URL often just points to some VPS running Apache somewhere. Anyone with access to that machine, anyone who buys that domain name in the future, or anyone who compromises that machine can change the image, title, description, etc for the NFT to whatever they’d like at any time (regardless of whether or not they “own” the token). There’s nothing in the NFT spec that tells you what the image “should” be, or even allows you to confirm whether something is the “correct” image.

All this means that if your NFT is removed from OpenSea, it also disappears from your wallet. It doesn’t functionally matter that my NFT is indelibly on the blockchain somewhere, because the wallet (and increasingly everything else in the ecosystem) is just using the OpenSea API to display NFTs, which began returning 304 No Content for the query of NFTs owned by my address!

royalties aren’t specified in ERC-721, and it’s too late to change it, so OpenSea has its own way of configuring royalties that exists in web2 space.

Turning these into requirements:

  1. There should be non-transient addressing for the asset. The next-generation vision of NFTs principle addresses this as a goal, but needs tuning to actually carry it off.
  2. The blockchain community must not accept OpenSea's version of your wallet as your wallet. Truly decentralized approaches are absolutely possible, using shared code and standardized protocols rather than centralized monolith APIs.
  3. Open source libraries and protocols must be extended to cover OpenSea's proprietary features, including royalties.
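
The missing hash commitment that Marlinspike points out is straightforward to sketch. The idea is to record a digest of the asset when the token is minted and check fetched bytes against it later. Where that digest would be stored is left open here: no current NFT spec defines such a field, and the function names are mine.

```python
import hashlib

def commitment_for(content: bytes) -> str:
    """Digest that would be recorded alongside the URL when the token is minted."""
    return hashlib.sha256(content).hexdigest()

def verify_asset(content: bytes, committed_digest: str) -> bool:
    """True only if the fetched bytes match the digest committed at mint time."""
    return commitment_for(content) == committed_digest.lower()

original = b"original image bytes"
digest = commitment_for(original)
print(verify_asset(original, digest))
print(verify_asset(b"swapped image bytes", digest))
```

With a commitment like this, whoever controls the server can still take the image down, but can no longer swap it for something else without detection.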

Enter writing.gonze.com

I have created a custom domain for my Listed blog, so I can own this should it wind up mattering.

The URL of this post is https://writing.gonze.com/31203/enter-writing-gonze-com

Hello Listed World

This is my very very first post on https://listed.to. The URL of this post is https://listed.to/authors/19771/posts/31190

Welcome to me!