Cattle Not Pets

When I first heard the term “cattle not pets,” it was the perfect metaphor for a concept I had always been aware of when developing for the cloud but never had the words to describe. The idea is that you should remove individuality from your cloud infrastructure and treat your resources as nameless and dynamic, like cattle. Resources come and go all the time, so there is no time to name, feed, and care for them as if they were pets.

I’m sure many of us have been somewhere that has a fleet of servers named after superheroes, Disney characters, or something exceedingly nerdy like Doctor Who villains. When we start talking about scalability, though, characters can’t be imagined fast enough, not to mention the hand-feeding required to spin up new instances of an application over and over again. As we were developing our cloud infrastructure to scale for Muserk, our first goal was to never connect directly to an instance again. This felt like a great starting point for answering the question of how we deploy applications, manage state, and debug issues that arise. This is mostly a qualitative look at how we began to scale our operations in the early days of Muserk, so for you super nerds out there, we won’t go into detail about things like load balancing, caching, or infrastructure as code.

DEPLOYING APPLICATIONS

Probably the most important aspect of scaling is being able to deploy an application programmatically. Once we can do that, everything else is just a facility. The obvious answer here is Docker. The more advanced answer involves Kubernetes or Terraform, but that’s a topic for another day. With a containerized application we can control dependencies, versions, the operating system, and any configuration that needs to be done ahead of time. So all we need is a platform to run our container. The advantage is that this platform can be anything! The container will run exactly what we need, the exact same way, anywhere that can support Docker. Once the process of starting one of these containers is automated, we are free to start up as many as we would like and let a load balancer route traffic appropriately.
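To make “automated” a little more concrete, here is a minimal sketch of what starting nameless containers could look like. It only builds the `docker run` commands rather than executing them. The function name, the image name, and the `app-` name prefix are all hypothetical, and this is not the tooling we actually use.

```python
import uuid


def build_run_commands(image, count):
    """Build one `docker run` argument list per replica.

    Names are generated, never hand-picked: cattle, not pets.
    """
    commands = []
    for _ in range(count):
        # Hypothetical naming scheme: a fixed prefix plus a random suffix.
        name = f"app-{uuid.uuid4().hex[:8]}"
        commands.append([
            "docker", "run",
            "-d",            # run detached, in the background
            "--name", name,  # generated name, only useful for bookkeeping
            "-P",            # publish exposed ports on random host ports
            image,
        ])
    return commands


# "example/app:1.0" is a placeholder image name.
for command in build_run_commands("example/app:1.0", 3):
    print(" ".join(command))
```

Each argument list could be handed to `subprocess.run` on a host with Docker installed. In practice an orchestrator takes over this job, which is the Kubernetes topic for another day.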

MANAGING STATE

Next there is the problem of how to manage state on a server instance that is essentially garbage. Writing to local disk is out of the question because all of that information would be lost whenever an instance is replaced. Well, what about NFS? This could be a plausible solution, but it is too slow without provisioned IOPS (which are expensive in the cloud). Besides, we should do better!

In fact, this was the starting point for really honing our data model, and it forced us to come up with a first pass at some sort of ETL. As we ingest data, how do we store it so that our applications can access it in a consistent way? Once all of our data is in one place, we can use it as our Single Source of Truth. Using a database as an SSOT brings its own complexity. The real lesson for managing state across a scalable infrastructure is to AVOID state when you can.
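As a rough illustration of avoiding state, the sketch below keeps nothing on the worker itself. `SharedStore` is a hypothetical in-memory stand-in for an external database, and the class and method names are invented for this example.

```python
class SharedStore:
    """Stand-in for an external database acting as the single source of truth."""

    def __init__(self):
        self._rows = {}

    def get(self, key, default=0):
        return self._rows.get(key, default)

    def put(self, key, value):
        self._rows[key] = value


class StatelessWorker:
    """A worker that holds no data of its own; everything lives in the store."""

    def __init__(self, store):
        self._store = store

    def record_play(self, track_id):
        # A real database would need an atomic increment here;
        # this read-then-write is only safe in a single-threaded sketch.
        count = self._store.get(track_id) + 1
        self._store.put(track_id, count)
        return count


store = SharedStore()
worker_a = StatelessWorker(store)
worker_b = StatelessWorker(store)

worker_a.record_play("track-1")
del worker_a  # the instance goes away, and nothing is lost
print(worker_b.record_play("track-1"))  # prints 2
```

Because neither worker remembers anything between calls, any instance can pick up any request, and losing one costs nothing.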

DEBUGGING ISSUES

Most commonly, the reason for needing to log into an individual instance is to figure out what went wrong. As resources start to scale, this gets increasingly difficult anyway, because an error could have occurred on any one of 4, 10, or, theoretically, n instances. So how do we figure out where problems are happening and how to fix them? There are all sorts of things to monitor across our applications. User experiences, resource trends, and load times are a few examples. Most important, in my opinion, are the error logs.

When an error occurs, we want to be made aware of it. As a first pass, you should be using a logger. A logger lets us standardize how we create new logs by assigning a category to each type of log and ordering them by severity. Some common categories include DEBUG, INFO, and ERROR. In this example, DEBUG-level logs may contain information that would be helpful when figuring out what happened, but that doesn’t need to be looked through all the time. INFO-level logs are a step up in severity. These are messages we may always want to see so that we can watch usage in real time. ERROR logs, being the most severe, can be alerted on. We can configure our system to report when an ERROR has been logged so that we can take immediate action. We can then use the INFO and DEBUG logs to determine what happened. If we’ve done it correctly, these logs will identify the specific machine the application is running on so we can handle hardware-specific problems. Once we are collecting logs from all machines across all applications, we can begin to build dashboards around each application. Combined with usage and hardware metrics, we have a central location to view all relevant information.
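Here is a small sketch of that setup using Python’s standard `logging` module. The `HostFilter` and `AlertHandler` classes and the `build_logger` helper are names made up for this example. The alert handler simply collects ERROR messages in a list, where a real system would forward them to whatever alerting service is in use.

```python
import logging
import socket
import sys


class HostFilter(logging.Filter):
    """Stamp every record with the hostname of the machine that produced it."""

    def __init__(self):
        super().__init__()
        self.host = socket.gethostname()

    def filter(self, record):
        record.host = self.host
        return True  # never drop a record, only annotate it


class AlertHandler(logging.Handler):
    """Hypothetical alert hook: keeps ERROR-and-above messages in a list."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.alerts = []

    def emit(self, record):
        self.alerts.append(record.getMessage())


def build_logger(name, stream=sys.stderr):
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let every severity through to the handlers
    logger.propagate = False
    logger.handlers.clear()         # avoid duplicate handlers on repeated calls

    stream_handler = logging.StreamHandler(stream)
    stream_handler.addFilter(HostFilter())
    stream_handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s host=%(host)s %(name)s: %(message)s"))
    logger.addHandler(stream_handler)

    alert_handler = AlertHandler()
    logger.addHandler(alert_handler)
    return logger, alert_handler


logger, alert_handler = build_logger("ingest")
logger.debug("parsed input file")            # detail for later investigation
logger.info("ingest finished")               # usage we may always want to see
logger.error("database connection refused")  # also lands in alert_handler.alerts
```

Since the hostname travels with every line, logs shipped from many instances to one place can still be traced back to the machine that wrote them.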

I hope this was in some way helpful for thinking about your own cloud infrastructure. As we continue to improve our architecture, we hope to have more to share. We are evolving our technology every day and are working hard to improve our ETL workflows and their integration into the substantial amount of processing we do with the data we generate. In the meantime, we will continue to backfill posts with what we have learned and implemented along the way of this journey into the final frontier.

Product Innovation In Music

When we think of product innovation in the music industry, much of the focus centers on new ways to create music and new ways for fans to consume it. As far back as piano rolls, the idea of having a machine play your favorite music in your own home was amazing. This iterated into rolls and player modules that better captured and reproduced the nuances of a performance (e.g., dynamics, attack, etc.). Welte-Mignon brought the public Debussy playing Debussy the way Debussy intended! This natural progression can be seen throughout the industry’s history with the phonograph, radio, film, television, cassette to CD, streaming, and so on. As new technology brought us more music to listen to, the business and administration side has always reacted with ways to properly manage and exploit rights: mechanical royalties, performance rights societies, and licensing of rights for new media, just to name a few. However, with all of the progress on the macro level of the business side, there is still a lot of room for innovation. The challenge is how to productize the business and administration side of the industry with the same level of innovation that the production and consumption of music has been given. Let us explore.

What is a Product in the first place?

For the sake of this piece, let’s call a product an aggregation of parts & commodities, packaged with a brand, in order to create a usable, productive, satisfying experience, which is then sold. Huh? Maybe this is better explained with an example:

A bunch of steel, third-party machined parts, tires, a computer, an all-wheel-drive system, etc. is “an aggregation of parts and commodities” that could make an SUV.

The SUV “packaged with a brand,” say Subaru, could bring us one of several Subaru SUV models which are likely creating an image in your head of who buys it and where they drive it.

The ability to drive the car (usable), to your destination, through the woods with a mountain bike on the back, in the snow with skis on the roof, to the beach with a surfboard (productive), all the while remaining content because it never fails, and you didn’t pay the price of an Audi.

So for music, a product such as the Spotify® or Apple Music® app could be described as the aggregation of licensed music (the commodity) delivered through a graphical user interface that sits on a robust tech infrastructure (some parts) and gets people to the music they want, or never knew they wanted, with ease (the brand, usability, experience, etc.). Taking any product through the same exercise starts to reveal why one is chosen over another and why some brands are more successful than others. During product development at Muserk, we use this exercise as a means to identify the real issues at hand and begin the process of building the solutions that address them head-on. So let’s now dive into what we can do about innovation in the music business.

Don’t confuse pain points with problems

At Muserk, we remain focused on the root of industry challenges. We strive to stop identifying broad pain points and pretending they are the problem. They are, in fact, the symptom. Instead, we break them down and arrange them within their respective processes. There is no shortage in our industry of pinpointing issues in copyright, royalty collections, lack of transparency, and so on. Anyone who has attended a music industry panel knows we are great at identifying issues in the industry, and that is a good thing. Unfortunately, panels can sometimes be on par with a New York City co-op shareholder meeting, where people complain about the same things as last year and offer no practical solutions. At times solutions can even be met with ire, as jaded individuals dread both their current circumstances and the change that could do them some good. Alternatively, taking the pile of identified pain points, breaking them down into their individual problems, and arranging them in order to identify their place within a process will uncover their effect on a crucial dependency (i.e., a very important point of failure). It is a focus on these real issues that leads us to effective solutions.

For example, with personal finance, “I don’t have enough money” is a problem (really a symptom) we all face at one point or another. When broken down, it is because of an income problem or a spending problem. If it’s an income problem, it could be because your gross pay rate is too low, or the number of hours or days to which the rate is applied is too low. Or maybe your net pay is being affected by a wage garnishment or another withholding. Your problem may even pivot. You might find that a pre-tax retirement contribution is affecting your net income and doesn’t support your eating out/coffee habit. You decide the compound interest over 30 years is in fact better than the convenience of not making your lunch/coffee, and your perceived income problem is in fact a spending problem.
Solution: make your own damn coffee and lunch!

Let’s identify and arrange

One major pain point in the industry that we, Muserk, alleviate is:

…having no idea how much a rights holder should be making but there is a strong feeling it should be more…

This pain point is really a symptom of a whole bunch of underlying issues embedded in a ton of processes that span the globe. The recording and publishing industries consist of many data pipelines, supply chains, and royalty streams fragmented by rights types, mediums, platforms, and licensing configurations. These royalty streams are subject to the business practices, copyright laws, and capabilities of local markets. Constantly emerging music platforms create new royalty streams with new licensing configurations, which adds complexity to an already complex industry. So, if we are to make meaningful products to fix this, let’s start to list out the big little problems that make up the overarching issue:

  1. Music is on a DSP but not being played
  2. Music isn’t attached to my artist page
  3. Not getting paid for cover versions of your songs
  4. Having no way to find ISRCs linked to your compositions
  5. It’s been a year since a release and songwriter splits have not been decided
  6. Being able to collect PRO money for radio play, but no mechanicals from DSPs
  7. Being able to collect for U.S. activity but nothing internationally
  8. Being able to collect for all activity but the U.S.
  9. Having no idea how to register your works to receive publishing money
  10. Acquiring a catalog of masters but there is no useful metadata
  11. Not getting paid for remixes
  12. Unable to find 473 works in the 22 million rows of usage data provided by a music platform

 …we could go on forever before we even mention deal terms and royalty rates which is where most everyone looks first

Start arranging

As you arrange problems in the order of a process, you will begin to find that one problem may be the result of another before it, or that it may be creating a bigger issue further down the line. For example, there may be a solution for automated global delivery of works data that helps with compiling and delivering data, but that ends up efficiently populating databases around the world with incorrect information. Oops! The challenge is anticipating the roadblocks that may appear when fixing things in one area only uncovers issues in another. When developing solutions in this way, you soon discover that even the most complex problems can be solved one issue at a time. It is this mindset, combined with an agile approach, that so many software development teams use when building. Moving on.

Fix the stuff you know, take cues from other non-music industries for everything else

Everyone agrees that a fresh perspective can uncover new and innovative ways to approach problems. Anyone who has spoken with me about this knows I am a broken record when it comes to finding solutions outside of our industry (pun kind of intended). So, with the example from above, there is a data input issue and a delivery issue: two steps of a much longer process. Normally, we look for solutions within our industry. However, Muserk has found that looking at industries that have had great success in fixing similar issues can be quite helpful. With the data input issue, think about ecommerce and the checkout process. If you have ever purchased an item online (navigated to your shopping cart, entered an address, credit card info, shipping, etc.), you have completed a process that companies spend a lot of time and money tweaking and figuring out. Companies hate abandoned shopping carts and do whatever they can to ensure you complete your purchase. If we want to ensure music metadata is accurate going into our systems, perhaps the ecommerce shopping cart world is on to something in terms of design, UX/UI, information gathering, etc. Countless problems have already been solved in other industries.

Building something truly useful, and constant iteration

By now, we can all agree that rights administration consists of many linear processes that are each subject to many points of failure. It is important to always remember that one improvement somewhere can amplify a deficiency elsewhere, or that a high-performance feature in one area can be rendered useless by a weak link down the chain. There’s no reason to put a jet engine in a small prop plane if it is just going to tear the whole plane apart on takeoff. At Muserk, we realize building a single feature is useless without the infrastructure to support it, a workflow into which it integrates, or the personnel to use it. MMatch, Muserk’s AI matching technology that discovers sound recording/musical work links, at the proof-of-concept stage required a senior developer and a tech-savvy rights manager to run: not what we would call a scalable solution. Not until there was a UI for a rights manager to input various data formats, along with automated steps that made MMatch’s output data usable, was the technology accessible to everyone on the team and therefore used more often.

Now that productivity has shot through the roof, do we stop there? No. At this point we are ready to iterate and not be outpaced by industry demand. The tech world can be harsh in this regard. A product version or feature can go from “beta” to “deprecated/obsolete” in a handful of years or less. Once MMatch became more widely used by the team, the rights management team cheered the sudden ability to complete days of work in under an hour, but understood that this vast improvement shifted the bottleneck from usage discovery to staging data and the analysis that followed MMatch. It is true that days of work were gone, but why stop there? Why not apply MMatch to other use cases? Or better yet, why not make this one incredible product one of several “parts & commodities, packaged with a brand, in order to create a usable, productive, satisfying experience”? It is at this point that product development resembles a continuous cycle or even an expanding spiral. An innovation, from our own industry or another, that once served as an effective stand-alone solution then becomes a lynchpin for a larger, future product.

So let’s review. We have gone from simply airing grievances to identifying problems, and from understanding these problems’ effect on the bigger picture to ideation of real solutions. The business side of the music industry has a long way to go in terms of technology, but I personally look forward to Muserk being a part of the massive amount of innovation we will see in the future and helping modernize the music industry.

Cattle Not Pets – Configuring Scalable Resources

When I first heard the term “cattle not pets,” it was the perfect metaphor for a concept I had always been aware of when developing for the cloud but never had the words to describe. The idea is that you should remove individuality from your cloud infrastructure and treat your resources as nameless and dynamic, like cattle. Resources come and go all the time, so there is no time to name, feed, and care for them as if they were pets.

I’m sure many of us have been somewhere that has a fleet of servers named after superheroes, Disney characters, or something exceedingly nerdy like Doctor Who villains. When we start talking about scalability, though, characters can’t be imagined fast enough, not to mention the hand-feeding required to spin up new instances of an application over and over again. As we were developing our cloud infrastructure to scale, our first goal was to never connect directly to an instance again. This felt like a great starting point for answering the question of how we deploy applications, manage state, and debug issues that arise. This is mostly a qualitative look at how we began to scale our operations in the early days of Muserk, so we won’t go into detail about things like load balancing, caching, or infrastructure as code. If you’re looking for that type of thing, stay tuned!

Deploying Applications

Probably the most important aspect of scaling is being able to deploy an application programmatically. Once we can do that, everything else is just facility. The obvious answer here is Docker. The more advanced answer involves Kubernetes or Terraform, but that’s a topic for another day. With a containerized application we can control dependencies, versions, the operating system, and any configuration that needs to be done ahead of time. So all we need is a platform to run our container. The advantage is that this platform can be anything! The container will run exactly what we need, the exact same way, anywhere that can support Docker. Once the process of starting one of these containers is automated, we are free to start up as many as we would like and let a load balancer route traffic appropriately.

Managing State

Next there is the problem of how to manage state on a server instance that is essentially garbage. Writing to local disk is out of the question because all of that information would be lost whenever an instance is replaced. Well, what about NFS? This could be a plausible solution, but it is too slow without provisioned IOPS (which are expensive in the cloud). Besides, we should do better!

In fact, this was the starting point for really honing our data model, and it forced us to come up with a first pass at some sort of ETL. As we ingest data, how do we store it so that our applications can access it in a consistent way? Once all of our data is in one place, we can use it as our Single Source of Truth. Using a database as an SSOT brings its own complexity. The real lesson for managing state across a scalable infrastructure is to AVOID state when you can.

Debugging Issues

Most commonly, the reason for needing to log into an individual instance is to figure out what went wrong. As resources start to scale, this gets increasingly difficult anyway, because an error could have occurred on any one of 4, 10, or, theoretically, n instances. So how do we figure out where problems are happening and how to fix them? There are all sorts of things to monitor across our applications. User experiences, resource trends, and load times are a few examples. Most important, in my opinion, are the error logs.

When an error occurs, we want to be made aware of it. As a first pass, you should be using a logger. A logger lets us standardize how we create new logs by assigning a category to each type of log and ordering them by severity. Some common categories include DEBUG, INFO, and ERROR. In this example, DEBUG-level logs may contain information that would be helpful when figuring out what happened, but that doesn’t need to be looked through all the time. INFO-level logs are a step up in severity. These are messages we may always want to see so that we can watch usage in real time. ERROR logs, being the most severe, can be alerted on. We can configure our system to report when an ERROR has been logged so that we can take immediate action. We can then use the INFO and DEBUG logs to determine what happened. If we’ve done it correctly, these logs will identify the specific machine the application is running on so we can handle hardware-specific problems. Once we are collecting logs from all machines across all applications, we can begin to build dashboards around each application. Combined with usage and hardware metrics, we have a central location to view all relevant information.

I hope this was in some way helpful for thinking about your own cloud infrastructure. We have come a long way in the past several years and still have a long way to go. As we continue to improve our architecture, we hope to have more to share. In the meantime, we will continue to backfill posts with what we have learned and implemented along the way.

Thoughts from a Working Musician In Nashville

“Nashville has a long history of songwriting.”

This was something that I heard over and over when I moved here in the fall of 2017. At the time, I didn’t understand that this statement was actually an insight into how the music industry operates. To me, the word “songwriter” wasn’t much different from the word “artist” or “musician.” I had grown up playing songs by my favorite artists as well as writing and performing my own songs. It was all music to me. It wasn’t until later that I realized the music industry operates on some very clear distinctions – particularly in Nashville.

One of the first shows I went to in Nashville was at a down-home venue called Belcourt Taps. The show was an “in the round” style showcase where four different songwriters sat on stage side by side and took turns playing a song they had recently written. I had never encountered this type of show in Austin, where I had moved from, but I got the sense it was standard practice here. To my surprise, one of the songwriters in particular was a very bad musician. His guitar playing was filled with missed notes and he struggled to sing in tune. But what was fascinating was that he didn’t seem to care at all. He was more interested in the audience, trying to gauge their reaction to his songs. I quickly realized that he had no interest in performing these songs on his own. His goal was to refine his songs down to their most entertaining form: that three-and-a-half-minute gem. It reminded me of how a comedian works on a joke over and over until he or she gets it just right. This was my first insight into how the music industry makes a clear distinction between artists and songwriters.

About a year later, I began working for Muserk as a software developer. Muserk is a global rights administrator that leverages technology to perform its duties with an exceptional level of speed and scale. I was intrigued by the opportunity to combine my tech career with my love of music. Furthermore, it was a chance to learn more about the business side of the music industry, something I thought would be useful in my own music endeavors.

As soon as I started, I was thrown (sink or swim, as the saying goes) into the very complicated world of rights management. One of my first projects was developing what would later be known as MMatch, our proprietary AI technology used for finding works in the vast ocean of DSP data. Through this I learned the intricacies of one way the music industry makes money.

The music industry makes money from two copyrights: one for the underlying work or composition and the other for a sound recording. In practice, there are two types of businesses that form around this: publishers (songwriter/work) and labels (artist/sound recording). So, if you have a song playing on Spotify, let’s say, a portion of the money generated from that song should find its way to the label/artist and a portion should find its way to the publisher/songwriter(s). You may think (as I did) that a company like Spotify would know all of this in advance and take care of it. That is not the case.

One of the big problems is that the label and publishing worlds don’t really talk to each other. So, a label will push a song to Spotify and not provide (and in some cases not even know) any information about the underlying songwriters. Therefore, Spotify won’t know where to send the publisher/songwriter portion of the money. This is a fairly simplified but accurate account of what happens.

This is where Muserk shines. We spend most of our time matching songwriter-related metadata to sound recordings so that we can collect and distribute the appropriate royalties. In the age of digital music this isn’t an easy task. We use all kinds of technology, processes, and insight to match as much data as possible. We’re constantly trying to innovate so that we can match works quickly, accurately, and at scale. I spend most of my time building this technology and creating ways to convey its results. I feel proud knowing that the work I do contributes to getting musicians paid what they’re owed.

As a musician, my time at Muserk thus far has opened my eyes to how the music world really works. I’ve learned that businesses dedicate themselves entirely to very small pieces of the industry. In Nashville, for instance, there are networks of people who are just trying to write the next hit song and couldn’t care less about recording or performing it. Concurrently, there are networks of people trying to be the next big artist who couldn’t care less about writing their own songs. As for me, I’m still trying to figure out where I fit in. But I know that having a broader understanding of the industry as a whole will help me navigate my own musical journey. And, of course, my metadata will be correct.


Automated Copyright Administration – Where Technology Meets Process

In the last two decades, we have watched the music industry explode with innovation. Today, all that is required to listen to nearly any piece of musical content is an internet connection. Likewise, the basic tools necessary for creating said content now come pre-packaged with most computers. The average songwriter harnesses the power to make their work available to the world overnight. These simple facts are often taken for granted, as is the value we assign to music and those who are responsible for bringing it to our devices. In the wake of an unprecedented rush of content, the industry is tasked with making sure that all songwriters are accounted for in a timely, efficient, and accurate manner. Failing to do so will only further validate the feelings of distrust and skepticism that many artists hold towards the music business.

An unfortunate consequence of the modern royalty collection ecosystem is a prioritization of the high-earning works. Our proprietary matching technology called MMatch® enables Muserk to treat long-tail works with the same weight as top-earners. Through a combination of MMatch® and standardized processes, Muserk is capable of handling a massive volume of data while simultaneously minimizing the opportunity for human error. This value-agnostic approach allows us to treat all royalties with the same priority.

Identifying usage has surpassed the scale at which humans alone can reasonably achieve adequate results. Every month, tens of millions of sound recordings are streamed on music services in the United States alone. In the beginning, to even reach a starting point where a person could begin analyzing potential matches, they were left to rely upon the only common composition-level data point that exists in most usage data: an ISWC. This means that songs lacking an International Standard Musical Work Code could not receive the attention they deserve, since matching on title or writer alone produces disastrous results. Besides the limitations on accuracy that are inherent in such a process, scaling to a global marketplace simply could not be achieved. With these obstacles in mind, Muserk began developing the MMatch® technology, which is capable of evaluating the relationship between text-based data points such as titles, writers, and artists. Muserk’s data pipelines have evolved to match the work once done by hand and have been enhanced with the capability to handle a wider range of data points.
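To illustrate why combining data points helps, here is a toy sketch. It is not MMatch® and says nothing about how MMatch® works. It swaps in simple string similarity from Python’s standard `difflib` module, and the field names, the equal weighting, and the sample rows are all invented for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0.0 to 1.0 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(usage, work, title_weight=0.5, writer_weight=0.5):
    """Combine title and writer similarity into one score (weights are arbitrary)."""
    title = similarity(usage["title"], work["title"])
    # Compare the usage row's writer against each listed writer; keep the best.
    writer = max(
        (similarity(usage["writer"], w) for w in work["writers"]),
        default=0.0,
    )
    return title_weight * title + writer_weight * writer


# Invented sample data: two different works that share a title.
usage_row = {"title": "Home", "writer": "J. Doe"}
work_a = {"title": "Home", "writers": ["Jane Doe"]}
work_b = {"title": "Home", "writers": ["Alex Rivera"]}

# Title alone cannot tell the two works apart...
print(similarity(usage_row["title"], work_a["title"]) ==
      similarity(usage_row["title"], work_b["title"]))  # prints True

# ...but adding the writer separates them.
print(match_score(usage_row, work_a) > match_score(usage_row, work_b))  # prints True
```

Even in this tiny case, the title by itself is ambiguous, which is the kind of result that makes single-field matching unreliable at scale.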

A common criticism of employing artificial intelligence to overcome obstacles in royalty collection is that one cannot be entirely sure that a link has been accurately identified. At Muserk, we recognize the truth in this sentiment by acknowledging the critical stages of human analysis that occur prior to pushing newly discovered data to the DSPs. In any industry, technology is meant to aid people in performing their jobs. Just as doctors do not rely on heart monitors alone to save the lives of their patients, a rights administrator cannot rely solely on any piece of software to confidently collect on behalf of their rights holders. While we cannot completely eliminate human interaction in the rights collection process, our data pipelines help us reduce our input to near zero.

We are evolving our process with every iteration by continuously targeting our biggest bottlenecks and identifying how information can help us make better decisions. With a desire to do more with less, we are motivated to keep reducing the human workload required to collect royalties. As the methods through which music enters the marketplace continue to evolve, Muserk will remain an instrumental player in shaping the narrative of modern rights management.