
mlops.community meetup - ML Governance_ A Practical Guide.pptx

ML Governance is often discussed either in abstract terms without practical details or using detailed AI ethics examples. This talk will focus on the day-to-day realities of ML Governance. How much documentation is appropriate? Should you have manual sign-offs? If so, when and who should perform them? When is an escalation needed? What should a governance board do? What if you are in a regulated industry? How can MLOps help? And most importantly, what is the point of all this governance and how much is too much? This talk will show how each organisation can best answer these questions for their own context by referring to examples and public resources.


  1. ML Governance: A Practical Guide — Ryan Dawson, Principal Data Consultant, ryan.dawson@thoughtworks.com; Meissane Chami, Lead ML Engineer, meissane.chami@thoughtworks.com. © 2021 Thoughtworks
  2. [Agenda] 1) Why ML Governance is Confusing 2) An Actionable View of ML Gov 3) Documentation in ML Gov 4) Working out the Details. Photo by Bich Tran from Pexels
  3. Why is ML Governance so Confusing? [Word-cloud graphic: ML GOVERNANCE — Ethics, Responsible AI, AI for Good, Ethics Principles, Governance Frameworks, Model Risk Management, Bias, Fairness, Transparency, Privacy, Security, MLOps, Reproducibility, Best Practice, Documentation, Audit, Legal, Governance Board, Sign-offs. Graphic inspired by https://www.growthbusiness.co.uk/why-governance-must-be-a-priority-for-startups-2550207/]
  4. Governance Requires us to Stop and Think — ● Many teams have little governance ● Tech teams have a delivery focus ● Build now, worry about risks later ● Burden can’t all be on tech ● Requires a collaborative process
  5. [The ML GOVERNANCE word-cloud graphic from slide 3, repeated]
  6. [The same word-cloud graphic, now clustered into: Ethics and Principles; Tech Practices and MLOps; Management and Frameworks]
  7. Ethics is Dominating the Conversation — ● Ethics and Responsible AI are important ● But only part of the conversation ● Must not neglect the boring stuff ● Boring stuff is core of good governance
  8. Perception of ML Governance [Word-cloud graphic: Ethics, Governance Board, Bias, Sign-offs, Bureaucracy, Misuse, Privacy, Security, Ethics Principles, MLOps, Explainability]
  9. Better View of ML Governance [Word-cloud graphic: Best Practice, Documentation, Oversight Board, Decision-making, Peer Review, Ethics, MLOps, Handover, Guidance]
  10. [Overlap diagram: Data Governance and ML Governance — Data Governance side: Data Quality, Data Access Policies, Data Retention, Data Security, Data Architecture, Data Management, Data Integration; shared region: Data Documentation, Data Lineage, Data Labelling; ML Governance side: ML Risk Management, ML Best Practice, Responsible AI, ML Governance Board, Model Documentation, MLOps. Slide template based on TearDrop by PresentationGo]
  11. The Boring Side of ML Governance — ● How much documentation is appropriate? ● Should you have manual sign-offs? ● If so, when and who should perform them? ● When is an escalation needed? ● What should a governance board do? ● What if you are in a regulated industry? ● How can MLOps help? ● And most importantly, what is the point of all this governance and how much is too much?
  12. How Much is Too Much? — ● Lots of manual checks on code and data will make for a slow process ● Team morale may be affected ● Process may not be followed. Porridge and bear images public domain from OpenClipArt
  13. Aside: Bureaucracy = Rule by Desks. Desk and Throne public domain via OpenClipArt; Crown by Wissenschaftler-Uni, CC BY-SA 4.0 via Wikimedia Commons
  14. An Actionable View of ML Governance [Process diagram: the Model Developer produces a Model card (model purpose, design, data description, risks); review is initiated when the model is ready for production; the Model Validator produces a Model validation report (establish clarity, reproducibility, best practices); the Model Owner gives Model owner approval (sign-off on clarity, monitoring plan, risks); changes flow back from review]
  15. [Slides 15–18: animation builds of the slide 14 process diagram; content identical]
  19. [The same process diagram with an Oversight board added; escalation possible from the Model Developer, Model Validator and Model Owner stages]
  20. [Same diagram] Oversight board may also lead a periodic review/audit process. Cycle image public domain from OpenClipArt
  21. Place Decisions with the Right Roles/Guardians — Model Developer: ● What does this model do? ● How does it work? ● What risks does it have? ● How best to monitor it? Model/Product Owner: ● Which product/quality risks are worth taking? ● Which mitigations are worth the extra time and effort? ● Sign-off on serious risks. Governance Board: ● Is it ok to use sensitive/PII data for this case? ● Where should we be improving gov/ML as an org?
  22. Model Validator is Part of This Too — Model Developer: ● What does this model do? ● How does it work? ● What risks does it have? ● How best to monitor it? Model Validator: ● Was the development process robust? ● Has the developer overlooked anything in best practice or risks?
  23. Documentation in ML Governance
  24. Checklists — ● Google model cards ● ‘Datasheets for datasets’ ● Meta/Facebook reproducibility checklists ● ‘The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction’ ● ‘Towards Yet Another Checklist for New Datasets’ ● ML Cards for D/MLOps Governance by Ian Hellstrom. Dog Breed Classifier image from Google
  25. Varieties and Purpose of Model Cards — ● “Under what conditions does the model perform best and most consistently? Does it have blind spots? If so, where?” ○ Dog Breed Classifier: “What kind of photos tend to yield the most accurate results? Can it handle partially obscured dogs? What about dogs that are extremely close, extremely far away, or seen from unusual angles?” ○ Language Translator: “guidance around jargon, slang and dialects, or measure its tolerance for differences in spelling” ● The card should give an overview of the internals and limitations of the model with a view to how it will be used ● It is an open documentation format, not a process ● Focused more on the model than the data (we will return to this). https://modelcards.withgoogle.com/about
  26. [The slide 14 process diagram repeated: Model card → Model validation report → Model owner approval]
  27. Deep-dive Questions on Process — ● Should the model validator be from a different team than the model developer? ● Where do Model Cards live? ● Who updates/maintains the cards? ● Does the validator need to fully reproduce the model and results? ● How much responsibility is on the Model Developer to explain the model vs on the Model Owner to ask questions? ● Who will be responsible for monitoring in live? ● How should the governance board be formed?
  28. Tricky Questions: How Much Detail? Model cards will tend to follow the lead of reference examples; detailed examples are more burden to produce and read. ● Reference examples will have a big impact on what documentation really gets produced ● Examples show developers how much and what kind of detail is expected. Photo by Magda Ehlers on Pexels
  29. Test the Process — ● Can’t prove workability in a vacuum ● Pick some cases and work it through together. Photo from WikiImages on Pixabay
  30. Tradeoffs and Opportunities — ● Not enough checks and you fail to surface risks ● Irrelevant/inappropriate checks slow you down ● Process can encourage best practice - this is an opportunity
  31. Hidden Risks [Graphic: Known Risks vs Overlooked Risks — Financial Risk, Legal Risk, Reputation Risk, Quality Risk, Ethical Risk, Delivery Risk, Regulatory Risk]
  32. Credit Assessment [Graphic: Known Risks — Financial Risks, First Order Gender Bias; Indirect Gender Bias? (e.g. from occupation) — actually not overlooked: a similar situation seemed to have happened with AppleCard in 2019 but the investigation found no bias (www.bbc.co.uk/news/business-50432634, www.theverge.com/2021/3/23/22347127/goldman-sachs-apple-card-no-gender-discrimination); Overlooked Risks — Reputation Risk from Customer Confusion]
  33. Summary — ● ML Governance is multi-faceted and can be confusing ● Simple template process based around model cards and defined roles ● Model Owner role is key to Risk Management ● Needs to be shaped to your team/s. Photo by Romain Dancre from Unsplash

Editor's notes

  • Intro speakers
  • Ryan:

    Lots of discussions of ML Governance never really get into details and can be confusing. We’re going to break through the confusion and tell you how you can really do something about ML Governance at the level of a Data Science team. This is going to involve talking about Documentation. But it’s good documentation - the kind of documentation that can really help you out if you use it wisely. So let’s get started.
  • First up, why is this topic so confusing? Why do so many people feel like they don’t even know what ML Governance is?
  • The fact is that many teams right now have very little governance. This is understandable as technologists have a delivery focus, which means teams are biased towards building solutions now and worrying about risks later. To get to a better place on governance the burden can’t all be on tech. It has to be a collaborative process. This leaves techies a bit nervous because nobody really knows what is needed and there’s a fear of some bureaucrat coming in and telling the team what they can and can’t do. The nervousness is amplified by confusion about what is really needed.
  • It’s understandable that we get confused about ML Governance. There are lots of different aspects and it’s easy to get lost in the conversations. There are hot topics like Responsible AI that get a lot of attention. And this is an MLOps meetup so we obviously love MLOps. But each of these is just a part of ML Governance.
  • We could roughly cluster different aspects of ML Governance under Ethics and Principles, Tech Practices and MLOps and Management and Frameworks. This helps us get a better picture of ML Governance but it only takes us so far.
  • We will get into the details very shortly. First just to re-emphasise that this is not a Responsible AI talk. Ethics and Responsible AI are important. But they’re only part of the conversation. We want to talk about the boring side of ML Governance.
  • Meissane:

    We also need to think about the relative weight or perceived importance of the topics under ML Governance. When people think about ML Governance they tend to think about something like this with Ethics a big part of it and the documentation and sign-offs stuff falling under bureaucracy that they’d rather not get involved with.
  • We want to shift this thinking and instead think about documentation and peer review as parts of best practice that should be reinforced by good governance. And sign-offs shouldn’t be about having to beg some bureaucrat to tick a box next to your model. It should be about positioning risk trade-off decisions with the right people.
  • It’s also worth understanding how ML Governance relates to other types of Governance, especially Data Governance, because there are significant overlaps between the two.

    The main areas of overlap are in documenting datasets. Where the dataset comes from, what it means, how it gets updated and its known limitations.

    This is needed for both Data and ML Governance and ideally it would fall under Data Governance so that Data Scientists can leverage it.
    It is not only needed for producing ML models but also for doing:
    Data Analysis
    Analytics dashboards
    Generally asking questions of data.

    Data Labelling likewise has a lot of value for Data Analysis and non-ML applications.

    Data Lineage is about tracking changes to the data over time (a minimal sketch of a lineage record follows after this list):
    This can be important for ML training pipelines and reproducibility.
    Once again there are also data analytics use cases where data lineage can be important.
    Sometimes this can be a requirement of auditors.
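
    As an illustration only (not something prescribed in the talk), a training pipeline might record a small lineage entry tying each run to the exact dataset version it consumed. The file path, field names and source name below are all hypothetical.

      # Hypothetical lineage record; file, field and source names are illustrative.
      import datetime
      import hashlib

      def dataset_fingerprint(path: str) -> str:
          # Content hash ties a training run to the exact bytes it consumed.
          digest = hashlib.sha256()
          with open(path, "rb") as f:
              for chunk in iter(lambda: f.read(8192), b""):
                  digest.update(chunk)
          return digest.hexdigest()

      def lineage_record(path: str, source: str) -> dict:
          return {
              "dataset": path,
              "source": source,
              "sha256": dataset_fingerprint(path),
              "recorded_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
          }

    A record like this could be stored alongside the trained model’s artifacts so an auditor can later match model to data.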
  • Meissane:

    So let’s talk about how to come up with an ML governance process.

    Here are the kind of questions we want to talk about to set up ML Governance.

    READ QUESTIONS

  • Processes can easily be over-engineered. Manual checks slow down the process and can be harder to follow and keep track of, which can affect team morale.

    Simply referring to it as ‘Best Practice’ isn’t enough to make us trust it. It might just be bureaucracy rebranded.

    For the team to feel comfortable with the process, it has to be relevant and appropriate.
  • This is a bit of a side note but the term ‘bureaucracy’ literally means ‘rule by desks’.

    When you are constrained by bureaucracy it does feel like you are at the mercy of something unthinking.

    However, it’s important to note that this feeling arises when rules don’t work well for your case. It’s like you can’t get done what you want to get done because somebody has made a rule without thinking about what you want to do. Processes and rules are not the problem.

    The problem is when the rules and processes don’t fit with what needs to be done.
  • Ryan: So now we know what ML Governance is about. How do we make it happen?
  • Ryan:

    This slide is going to look super simple. It is not the whole answer to ML Governance. But it is a starting point that we’ll use in this presentation.

    You can think of this slide as a flexible template for a process that can be adapted for different organisations and teams.

    The flow hinges around certain key documents that you can see here named as the Model Card, the Model Validation Report and Model owner approval. But the flow is not just about producing good documentation. It’s also about facilitating informed decision-making and positioning decisions with the most appropriate people.
  • Ryan:

    The model developer produces a model card which documents the purpose of a model, its design, what data it uses, what risks they can see around it and advice on how the model should and should not be used. This is checked by the model validator.
  • Ryan:

    The model validator also checks that the model is reproducible and that the code and documentation are clear. There might be some back and forth here. Think of it like a pull request review process. There might be a separate model validation report from this, or it might be a section that gets added to the model card, or it might even be a link to a pull request with structured comments.

  • Ryan:

    The next step is the model owner. The model owner is also looking for clarity about how the model works and how it should be used and its limitations. But the model owner probably won’t be technical so this needs to be explained at a different level. This might result in some more back and forth on the documentation. Most importantly the model owner needs to know about any risks and trade-offs associated with the model as they will be responsible at a business level for the model within the business product or process in which it is to be used.
  • Within this process there might also be an escalation route to an oversight board. Not every org will have an oversight board, but if you do then they would become involved in cases where a model is identified as high risk, triggering a deeper review with more parties. Factors that could trigger an oversight review (a simple mechanical check over flags like these is sketched after this list):

    Use of sensitive data or attributes (PII, protected attributes such as gender etc.)
    Models making decisions with a potential negative impact on an individual or entity
    Issues arising from ISRM security review
    Serious concerns about quality of the model and monitoring (e.g. live data not well known and unable to perform desired testing and monitoring)
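
    As an illustration only (the talk does not prescribe any tooling), triggers like these could be captured as flags in the model’s documentation and checked mechanically; all names below are hypothetical.

      # Hypothetical escalation flags; names are ours, not from the talk.
      from dataclasses import dataclass

      @dataclass
      class ModelRiskProfile:
          uses_sensitive_data: bool = False   # PII or protected attributes
          impacts_individuals: bool = False   # decisions with potential negative impact
          security_concerns: bool = False     # issues raised by ISRM security review
          monitoring_gaps: bool = False       # live data not well known, testing limited

      def needs_oversight_review(profile: ModelRiskProfile) -> bool:
          # Any single trigger is enough to escalate to the oversight board.
          return any([
              profile.uses_sensitive_data,
              profile.impacts_individuals,
              profile.security_concerns,
              profile.monitoring_gaps,
          ])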
  • An oversight board might also lead a periodic review process. Perhaps you do a review every year or at some other frequency. This might be for an external auditor though it’s better to think of this process for non-regulated industries first. We can think of regulated industries separately. You might also do an internal audit to check that documentation is all up to a similar standard. You might also use the information to look for patterns and opportunities within the org. There’s a lot to understand here so let’s try to make it more concrete. We’ll get into the details of model cards and understand what a model validator or model owner would be looking for. But first let’s understand the roles in more detail.
  • Meissane:

    In ML governance we want the right kinds of questions and decisions to sit with the appropriate roles.

    Too often what we’re seeing is that Data Scientists are assumed to have already assessed risks and dealt with them, so that product management and other business managers don’t have to think about them.

    This is not appropriate as Data Scientists are not empowered to make decisions about what risks are worth taking and are not able to simply make risks go away.

    Data Scientists are in a position to develop models, to explain what they do and make the risks and trade-offs of models clear. Data Scientists are also in a position to advise on what monitoring will be appropriate for running models in production.
  • There may be more than one model validator, each with a different aim.
    The assumption is that a model validator will be a fellow data scientist. This is necessary in order to check the robustness of the development process.

    But there may also be some validation from an ML Engineer or Support Engineer or similar to ensure that they know all the background to monitor the model in live.

    Ideally the Model Developer and an ML Engineer will work together to put together a Deployment and Monitoring plan. That also needs to be part of the extended Model Card as the model owner needs to know about it.
    They need to know about any deployment risks and what kind of monitoring is achievable as it is part of the overall risk profile.
  • There has been a lot of discussion about how best to document ML models. We’ve listed the most notable approaches to ML Governance documentation here.

    Model cards are a checklist that Google is trying to popularise. They’re focused on overviews and design trade-offs of models.
    Fairness and Limitation tradeoffs
    https://drive.google.com/file/d/1QvwWNfFoweGVjsXF3DXzcrCnz-mx-Lha/preview

    Datasheets are a kind of checklist for datasets, not for models. So they’re complementary.

    So model cards and datasheets both started from a position of reducing misuse, mistakes and bias. Reproducibility checklists started from a different angle. The motivation for reproducibility checklists was more about ensuring the robustness of the results being reported for ML models, especially in research papers.

    Another angle for checklists is production readiness. ML Test Scores for Production Readiness address deployment and infrastructure and also elements aimed at the ML model such as ensuring that the code is reviewed and in git and that hyperparameters are tuned and that the model chosen is as simple as it can be without loss of performance.

    With so many different angles to ML documentation, it’s clear that we need to cover a mixture of different concerns when documenting models. We might choose to do this in one checklist with a range of different sections or we could use a variety of checklists. The ML Cards for D/MLOps Governance link at the bottom of the slide here suggests using separate cards or checklists for different concerns and offers lots of suggestions for questions to include in the checklists.
  • We should now get into more detail on at least one of these checklist approaches. This will help us picture the idea more clearly. Google’s model cards are probably the easiest to explain, as Google has done a lot of work to popularise the idea.

    Model cards were proposed by Google in a research paper. Google has since added a toolkit and Google Vision face and object detection examples. READ SLIDE
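
    For a flavour of what this looks like in practice, here is a minimal sketch using Google’s model-card-toolkit Python package; the exact API has shifted between releases, so treat the calls as indicative rather than definitive.

      # Minimal sketch with Google's model-card-toolkit; API names vary by release.
      import model_card_toolkit as mct

      toolkit = mct.ModelCardToolkit(output_dir="model_card_assets")
      card = toolkit.scaffold_assets()  # create a blank card plus template assets
      card.model_details.name = "Dog Breed Classifier"
      card.model_details.overview = (
          "Classifies dog breeds from photos. Accuracy degrades for partially "
          "obscured dogs and unusual viewing angles."
      )
      toolkit.update_model_card(card)   # write the edited card back to the assets
      html = toolkit.export_format()    # render the card as a shareable HTML page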
  • Ryan:

    So that’s model cards. That’s the central piece in the process that we talked about before. Actually you could simply extend the model card concept and treat the three documents from this slide as one big model card. Maybe the model validator just provides feedback that updates the model card. And model owner approval could be recorded on the model card.
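
    One way to picture that ‘one big model card’ is as a single document with sections for validation and approval. Here is a minimal sketch as a plain data structure; the field names are our own invention, not from the talk or any published standard.

      # Hypothetical extended model card combining all three documents;
      # field names are illustrative, not a standard.
      from dataclasses import dataclass, field
      from typing import List, Optional

      @dataclass
      class ExtendedModelCard:
          purpose: str
          design: str
          data_description: str
          risks: List[str] = field(default_factory=list)  # known risks and trade-offs
          validation_notes: Optional[str] = None  # filled in by the model validator
          monitoring_plan: Optional[str] = None   # agreed with ML/support engineers
          owner_approval: Optional[str] = None    # name and date recorded at sign-off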
  • Ryan:
    This can sound easy when you talk about it in a presentation. The difficult thing is making it work for a particular team. There are lots of difficult questions you hit when you try to introduce a process like this in a real team.
    READ QUESTIONS
    Answering these questions tends to depend a lot on the context of the team and organisation. You have to talk to people and figure out what everyone will be comfortable with.
  • Ryan:
    Making the process work for a team isn’t just about talking to people either. There’s also documentation that shows people what the process is about and that’s super important.
    There should be reference examples for the documentation - example model cards that show models that make sense for the team. Reference examples will have a big impact on what documentation really gets produced because they show developers what kind of detail is expected.
  • Talking to people and producing reference documentation is also not enough. You should test out a new process and get feedback and adjust it. I would say adjust it until it is proven but really you can keep adjusting it forever as it can be a living process.
  • This is just a small piece of general advice about shaping a governance process. You somehow have to decide how much documentation detail is too much and how many sign-offs are too many. There is no general right answer. Firstly you have to look at your risks and get a sense for what realistically might go wrong and what the implications could be. Then you should work with your team and shape the process together. This ensures everyone feels included and buys into the process.
  • We’re coming to the end of the presentation now so we want to leave you with a key thought. ML Governance is about lots of things like best practice and communication and so on, but for many organisations the really big thing they need to tackle is risk management.

    Here’s a useful picture to keep in mind for risk management. We have to be wary of doing our risk assessments in a superficial way. It’s tempting to focus on specific risks or specific types of risk and then not really look for others. The format of the documentation should help practitioners go through risks in a methodical and balanced way. Otherwise you get bitten.
  • Let’s make this concrete by looking at a famous case of getting bitten by risks in using ML. There are lots of these but one that illustrates the point well is when AppleCard launched in 2019 and its credit assessments were accused of gender bias. Lots of high profile people were critical, including Steve Wozniak and David Heinemeier Hansson. The credit assessment service was operated by Goldman Sachs and they were quick to say that they were not using gender as an attribute. So then there was speculation that maybe gender was entering indirectly through other attributes. This could happen as some occupations have a big gender bias. In fact an investigation from the New York State Department of Financial Services found no gender bias. The problem was actually that people didn’t understand the logic. There were complaints that female spouses were getting lower limits and this was questioned on the basis of shared assets and income. But credit histories are not shared and that was actually part of the algorithm. Where the New York State Department of Financial Services did criticise Goldman Sachs though was on communication and customer response. Goldman had no way to respond to all these complaints and wasn’t able at the time to explain why the credit scores were coming out the way that they were. You can imagine this might have been overlooked or just not prioritised due to the rush to get the AppleCard service live.
  • First two bullets Ryan. Last two Meissane.
