The Best Feature of Sanity Datasets You Didn’t Know About

Daniel Becker
3 cabinet drawers, 2 being opened and revealing documents. 1 marked with the letters *.* is closed and a hovering cursor shows the forbidden sign

The Sanity Content Lake comes with a powerful feature for document IDs that I didn’t know about. As this feature was the root cause of some behavior I did not expect, I want to write about it here. It is described in detail in the official documentation under the title Paths. Although I vaguely remember reading through that page at some point, it did not save me from a long and unnecessary debugging session.

So here’s a lesson on the powerful functionality hidden inside Sanity’s document IDs.

tl;dr

When a document ID is prefixed with a string followed by a ., e.g. settings.tokens, it becomes a namespaced document and is hidden from unauthorized clients, even in public datasets.

What are namespaced IDs?

Every document in Sanity has an _id attribute by which it is identified. The ID is a string with a maximum length of 128 characters and may contain the characters a-zA-Z0-9._-. Some more restrictions apply; more information can be found in the official Sanity documentation.
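
To make the two rules above concrete, here is a minimal sketch of a local ID check. The function names are my own and not part of any Sanity package, and the additional restrictions from the official documentation are deliberately left out.

```typescript
// Rules taken from the text above: at most 128 characters, only a-zA-Z0-9._-
// Assumption: the further restrictions from the official docs are NOT checked.
const MAX_ID_LENGTH = 128;
const ALLOWED_ID_CHARS = /^[a-zA-Z0-9._-]+$/;

// Hypothetical helper: returns true if the ID passes the two rules above.
function isPlausibleDocumentId(id: string): boolean {
  return id.length <= MAX_ID_LENGTH && ALLOWED_ID_CHARS.test(id);
}

// Hypothetical helper: splits an ID into its path segments.
function pathSegments(id: string): string[] {
  return id.split(".");
}

const segments = pathSegments("settings.tokens.hosting");
// segments is ["settings", "tokens", "hosting"]
```

A check like this can run in a migration script before any document is sent to the API, but the Content Lake remains the final authority on what counts as a valid ID.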

When the . character is used in the _id, a “path” is created. This path can be used to organize and query documents in a hierarchical way.

Querying namespaced IDs

The query language GROQ comes with a function to filter documents by their path. The following query will output all documents whose _id attribute starts with settings.

*[_id in path("settings.**")]

// example output:
// […] 4 items
// { _id: "settings.handles.socialmedia", … },
// { _id: "settings.tokens", … },
// { _id: "settings.tokens.socialmedia", … },
// { _id: "settings.tokens.hosting", … },

The asterisk * works similarly to glob patterns and can be used as a wildcard for a single path segment, while ** also matches nested segments, as in the query above.

*[_id in path("settings.*")]

// example output:
// […] 1 item
// { _id: "settings.tokens", … },

This seems to work only at the end of the path, though: path("settings.*.socialmedia") won’t work.
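
The matching behavior of the two example queries can be approximated locally. The following is a sketch under my own assumptions, not Sanity's implementation: matchesPath is a hypothetical helper, and it only accepts a wildcard as the last segment, in line with the observation above.

```typescript
// Local approximation of how path() patterns match IDs.
// Assumption: only a trailing "*" or "**" is supported.
function matchesPath(id: string, pattern: string): boolean {
  const patternParts = pattern.split(".");
  const idParts = id.split(".");
  const last = patternParts[patternParts.length - 1];
  const hasWildcard = last === "*" || last === "**";
  const prefix = hasWildcard ? patternParts.slice(0, -1) : patternParts;

  if (prefix.some((part) => part === "*" || part === "**")) {
    throw new Error("This sketch only supports a wildcard as the last segment");
  }

  // Every literal segment of the pattern has to match the ID exactly.
  if (idParts.length < prefix.length) return false;
  for (let i = 0; i < prefix.length; i++) {
    if (idParts[i] !== prefix[i]) return false;
  }

  const remaining = idParts.length - prefix.length;
  if (last === "*") return remaining === 1; // exactly one more segment
  if (last === "**") return remaining >= 1; // one or more segments
  return remaining === 0; // no wildcard: exact match only
}

const exampleIds = [
  "settings.handles.socialmedia",
  "settings.tokens",
  "settings.tokens.socialmedia",
  "settings.tokens.hosting",
];

// All four IDs match, as in the first example query.
const nested = exampleIds.filter((id) => matchesPath(id, "settings.**"));
// Only "settings.tokens" matches, as in the second example query.
const direct = exampleIds.filter((id) => matchesPath(id, "settings.*"));
```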

Drafts as a special case of predefined namespaced IDs

When a document is edited in Sanity Studio, a copy of the document is created as a “draft”. All drafts are namespaced by the drafts path, e.g. the draft of a document with the ID test would be saved under the ID drafts.test.
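
Translating between the two IDs is plain string handling. A minimal sketch, with helper names that are my own:

```typescript
// Prefix used for draft documents, as described above.
const DRAFTS_PREFIX = "drafts.";

// Hypothetical helper: true if the ID lives in the drafts namespace.
function isDraftId(id: string): boolean {
  return id.slice(0, DRAFTS_PREFIX.length) === DRAFTS_PREFIX;
}

// Hypothetical helper: "test" becomes "drafts.test"; draft IDs are returned unchanged.
function draftIdOf(id: string): string {
  return isDraftId(id) ? id : DRAFTS_PREFIX + id;
}

// Hypothetical helper: "drafts.test" becomes "test"; other IDs are returned unchanged.
function publishedIdOf(id: string): string {
  return isDraftId(id) ? id.slice(DRAFTS_PREFIX.length) : id;
}
```

Checking for the prefix at the start of the string avoids touching a "drafts." that happens to appear later in an ID.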

For queries run with the raw perspective, filtering drafts with the path() function works:

// in `raw` mode
*[_id in path("drafts.**")]

// will return all documents in draft state

This changes, however, when switching to one of the other perspectives.

// in `previewDrafts` or `published` mode
*[_id in path("drafts.**")]

// [] 0 items

The (almost) hidden feature

There’s another functionality for IDs with paths: all documents with an ID including a . are hidden by default. That means these documents won’t appear in any query sent to the Content Lake API from an unauthorized client. This is true for public datasets, too.

None of the queries above will return any settings.** document when they’re sent from a client that is configured without a valid token.

Of course this functionality is shared with draft documents. Even before the introduction of perspectives, these weren’t available in public queries. I just never realized that this is true for all documents using “namespaced IDs”.
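
The default rule can be expressed as a tiny predicate. This is a sketch of the behavior described above, not of how the Content Lake implements it, and it assumes the default access rules are in place. Both function names are hypothetical.

```typescript
// Assumption: default access rules, i.e. every ID containing a "." is
// hidden from clients without a valid token.
function isHiddenFromPublic(id: string): boolean {
  return id.indexOf(".") !== -1;
}

// Simulates which documents a client without a token would get back.
function visibleToUnauthenticated<T extends { _id: string }>(docs: T[]): T[] {
  return docs.filter((doc) => !isHiddenFromPublic(doc._id));
}

const sampleDocs = [
  { _id: "test" },
  { _id: "drafts.test" },
  { _id: "settings.tokens" },
];

// Only the document with the ID "test" remains.
const publicDocs = visibleToUnauthenticated(sampleDocs);
```

A helper like this is handy in a test suite to catch documents that a frontend without a token would silently miss.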

Internal Content Lake Functionality

The Content Lake uses namespaced IDs to manage internal information. The documentation mentions that IDs prefixed with versions. could interfere with platform functionality.

Running the following query will reveal information on the roles defined in the dataset:

*[_id in path("_.**")]

// […] 10 items
// { "_id": "_.groups.administrator", "_type": "system.group", … },
// { "_id": "_.groups.create-session", "_type": "system.group", … },
// { "_id": "_.groups.public", "_type": "system.group", … },
// …

Escape Hatch

These roles also include a grants.filter array that holds the filters used to expose certain paths to authenticated roles like administrator or editor:

*[
  _id in path("_.**") &&
  count(grants[].filter) > 0
] {
  _id,
  "filter": grants[].filter
}

// […] 4 items
// {
//   "_id": "_.groups.administrator",
//   "filter": [
//   "_id in path(\"**\")",
//   "(_id in path(\"**\")) && _id in path(\"drafts.**\")"
//   ]
// },
// {
//   "_id": "_.groups.public",
//   "filter": [
//   "_id in path(\"*\")"
//   ]
// },
// {
//   "_id": "_.groups.sanity.editor",
//   "filter": [
//   "_id in path(\"**\")",
//   "(_id in path(\"**\")) && _id in path(\"drafts.**\")"
//   ]
// },
// {
//   "_id": "_.groups.sanity.viewer",
//   "filter": [
//   "_id in path(\"**\")",
//   "(_id in path(\"**\")) && _id in path(\"drafts.**\")"
//   ]
// }

On plans that “allow custom access controls” these filters can apparently be changed. As only enterprise plans allow for custom access controls (in the section “Security & Compliance”) I never had the chance to test this.

If you’re not on an enterprise plan you cannot easily expose namespaced IDs. Ways to query documents with namespaced IDs on public datasets are:

  • configure a Sanity client with an access token
  • if you don’t want to expose the access token to a (browser) client, you can add an API route to your setup to fetch the documents using a token

As far as I know, there’s no other way to get access to these documents if they use an ID with a path.
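
Both options boil down to sending the query with a token. The sketch below builds such a request against the HTTP query endpoint without sending it. The option names and the buildQueryRequest helper are my own, and the project ID, dataset, API version and token in the example are placeholders.

```typescript
interface QueryRequestOptions {
  projectId: string;
  dataset: string;
  apiVersion: string; // e.g. "v2021-10-21"
  query: string;
  token?: string; // omit to simulate an unauthenticated client
}

interface QueryRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: assembles URL and headers for a GROQ query request.
function buildQueryRequest(options: QueryRequestOptions): QueryRequest {
  const url =
    "https://" + options.projectId + ".api.sanity.io/" + options.apiVersion +
    "/data/query/" + options.dataset +
    "?query=" + encodeURIComponent(options.query);

  const headers: Record<string, string> = {};
  if (options.token) {
    // Without this header, namespaced documents are filtered out by default.
    headers["Authorization"] = "Bearer " + options.token;
  }
  return { url, headers };
}

const settingsRequest = buildQueryRequest({
  projectId: "your-project-id", // placeholder
  dataset: "production", // placeholder
  apiVersion: "v2021-10-21",
  query: '*[_id in path("settings.**")]',
  token: "secret-read-token", // placeholder, keep real tokens on the server
});
```

In an API route, the resulting url and headers would be passed to fetch on the server, with the token read from an environment variable so that it never reaches the browser.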

Pitfalls

I found out that paths in IDs hide documents from unauthorized clients during an annoying Friday evening debugging session. I was migrating a Sanity setup from version 1 of the document-internationalization plugin to its newest version (>3).

I encountered two hard-to-debug issues in quick succession. First, I was using content migrations in their new form, as I find these very nice to work with. When creating the metadata documents similar to how it’s done in the official documentation, I omitted the _id attribute. The _id does not seem to be required when using the create function of the Content Lake API. I was testing my migration in small steps and resorted to using the createIfNotExists function of the API, which does in fact require a document ID to be present (which makes total sense).

My migration included patches as well as new documents to be created, and at first I couldn’t make sense of where the error came from (although its message is fairly obvious).

# HTTPError: mutationError: Mutation failed: Missing document ID
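
A small guard can surface this problem before the request is sent. The following sketch builds plain mutation objects in the shape the mutation endpoint accepts. The helper names and the error text are my own and not part of the migration tooling.

```typescript
interface DocumentInput {
  _id?: string;
  _type: string;
  [field: string]: unknown;
}

// create works without an _id.
function createMutation(doc: DocumentInput): { create: DocumentInput } {
  return { create: doc };
}

// createIfNotExists needs an _id, so fail early with a local error.
function createIfNotExistsMutation(
  doc: DocumentInput
): { createIfNotExists: DocumentInput } {
  if (!doc._id) {
    throw new Error("createIfNotExists needs an _id (type: " + doc._type + ")");
  }
  return { createIfNotExists: doc };
}

// Serializes a list of mutations into a request body.
function mutationBody(mutations: object[]): string {
  return JSON.stringify({ mutations: mutations });
}
```

With a guard like this, the missing ID shows up next to the offending document instead of as a generic error for a whole batch of patches and creations.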

Once I figured that out I added an ID to the translation documents:

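// Excerpt from the migration: LANGUAGE_FIELD, weakRefs and refs are defined elsewhere.
const metadata = createIfNotExists({
  // Unlike create, createIfNotExists needs an explicit _id.
  // The "translation." prefix turns this into a namespaced ID.
  _id: `translation.${document._id}`,
  _type: "translation.metadata",
  translations: [
    {
      _key: document[LANGUAGE_FIELD],
      value: {
        _type: "reference",
        // Always reference the published document, never the draft.
        _ref: document._id.replace("drafts.", ""),
        ...weakRefs,
      },
    },
    ...refs,
  ],
});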

The migration ran flawlessly now. The local environment was using a token and worked fine as well. At first sight, the preview deployment appeared to be working, too! But when I tried to switch the language of the website I was working on, things started to fall apart. The alternate languages for each page weren’t loading because the unauthorized client could not access the “hidden” translation.* documents.

This is definitely something to be aware of when using namespaced IDs. I won’t forget that this feature exists after a debugging session that took longer than I’d like to admit.
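
One possible way to avoid this pitfall is to build IDs without a . when the documents have to be readable by unauthenticated clients. The sketch below illustrates that idea only. The helper name and the translation- prefix are hypothetical, and it assumes that nothing else depends on the ID format.

```typescript
// Hypothetical helper: derives a metadata ID that contains no ".",
// so the document is not treated as namespaced.
function publicTranslationMetadataId(documentId: string): string {
  const draftsPrefix = "drafts.";
  // Use the published ID as the base, even if a draft ID is passed in.
  const publishedId =
    documentId.slice(0, draftsPrefix.length) === draftsPrefix
      ? documentId.slice(draftsPrefix.length)
      : documentId;
  // Replace any remaining dots so the result has no path segments.
  return "translation-" + publishedId.split(".").join("-");
}
```

Note that the added prefix makes the ID longer, so very long source IDs could run into the 128 character limit.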

Applications for namespaced IDs

In our setups we sometimes have plugins configured for the Sanity Studio dashboard. These plugins often connect to external services to fetch some data and persist it in the Sanity dataset for easy access from the frontend, or they trigger functionality by calling a webhook. Both of these use tokens for authentication.

These tokens can be stored in the Content Lake using a namespaced ID like settings.tokens. This way they are accessible to the dashboard plugin through the user-authenticated client but do not leak to the public, which is really handy.

Let me know if you know of any other use cases! I am eager to learn more about this functionality.

Author

Daniel Becker

Co-Founder, Head of Tech
